Chapter 15 Statistics
Welcome to the solutions for Chapter 15: Statistics. While previous encounters with statistics likely focused on summarizing data using measures of central tendency (like mean, median, and mode), which describe the 'typical' value within a dataset, this chapter delves into another crucial aspect of data analysis: understanding its dispersion or variability. Central tendency measures alone provide an incomplete picture. For instance, two datasets might have the exact same mean but differ vastly in how spread out their values are. One set might cluster tightly around the mean, while the other might have values scattered widely. Measuring this spread, or dispersion, is essential for comprehending the distribution's nature, assessing consistency, comparing different datasets reliably, and making informed inferences. This chapter introduces several key statistical tools designed specifically to quantify the extent to which data points deviate from the average or spread out across the range of observations. We will explore methods applicable to both ungrouped (raw) data and grouped data presented in frequency distributions.
The solutions explore various measures of dispersion, starting with the simplest and progressing to more robust and widely used metrics:
- Range: This is the most basic measure, calculated as the difference between the maximum and minimum values in the dataset ($\text{Range} = \text{Maximum value} - \text{Minimum value}$). While it is easy to compute, the solutions note its significant limitation: it depends solely on the two extreme values and is highly sensitive to outliers, so it can give a misleading impression of the overall spread.
- Mean Deviation (MD): This measure gives a better sense of the typical deviation because it uses every observation. It is the average of the absolute differences between each observation and a central value, usually the mean or the median. Absolute values are used so that positive and negative deviations do not cancel each other out. The solutions demonstrate calculating:
  - Mean Deviation about the Mean: $MD(\bar{x}) = \frac{\sum\limits |x_i - \bar{x}|}{n}$ (ungrouped) or $MD(\bar{x}) = \frac{\sum\limits f_i|x_i - \bar{x}|}{N}$ (grouped, where $N = \sum\limits f_i$).
  - Mean Deviation about the Median: $MD(M) = \frac{\sum\limits |x_i - M|}{n}$ (ungrouped) or $MD(M) = \frac{\sum\limits f_i|x_i - M|}{N}$ (grouped).
- Variance ($\sigma^2$) and Standard Deviation ($\sigma$): These are the most important and commonly used measures of dispersion in statistics. They are based on the squared deviations from the mean, which gives greater weight to larger deviations and avoids the absolute values used in MD, which are awkward to manipulate algebraically. The variance is the average of these squared deviations. The standard deviation is the positive square root of the variance ($\sigma = \sqrt{\sigma^2}$); because it is expressed in the same units as the original data, it is easier to interpret than the variance. The solutions detail the calculation using the definitional formulas:
  - Ungrouped Data Variance: $\sigma^2 = \frac{\sum\limits (x_i - \bar{x})^2}{n}$
  - Grouped Data Variance: $\sigma^2 = \frac{\sum\limits f_i(x_i - \bar{x})^2}{N}$
Beyond calculation, the solutions emphasize the interpretation of these measures. For instance, they cover the analysis of frequency distributions that might share the same mean but exhibit different variances, illustrating how standard deviation effectively quantifies the consistency or spread within each dataset – a smaller $\sigma$ indicates data points are clustered more closely around the mean (more consistent), while a larger $\sigma$ signifies greater variability. For comparing the relative variability of two or more datasets, especially if they have different means or different units, the Coefficient of Variation (CV) is introduced. It's a unit-less measure calculated as $CV = \left(\frac{\sigma}{\bar{x}}\right) \times 100\%$. A lower CV indicates greater consistency relative to the mean. These tools provide a far more comprehensive understanding of data characteristics than central tendency alone.
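The contrast described above, two datasets with the same mean but very different spread, can be sketched in Python. The two datasets below are illustrative values chosen for this sketch (they are not taken from the textbook), and the function name `dispersion_summary` is likewise an assumption. The formulas are the ones listed above, dividing by $n$.

```python
import math


def dispersion_summary(data):
    """Return the measures of dispersion introduced above for raw data."""
    n = len(data)
    mean = sum(data) / n
    variance = sum((x - mean) ** 2 for x in data) / n  # divide by n, as in the formulas above
    std_dev = math.sqrt(variance)
    return {
        "mean": mean,
        "range": max(data) - min(data),
        "mean_deviation": sum(abs(x - mean) for x in data) / n,
        "variance": variance,
        "std_dev": std_dev,
        "cv_percent": std_dev / mean * 100,  # only meaningful when the mean is non-zero
    }


# Illustrative data (hypothetical): same mean, different spread.
tight = dispersion_summary([48, 49, 50, 51, 52])
wide = dispersion_summary([10, 30, 50, 70, 90])

for name, summary in (("tight", tight), ("wide", wide)):
    print(name, summary)
```

Both datasets have mean 50, but every measure of dispersion is much larger for the second one, and its larger CV marks it as the less consistent dataset.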
Examples 1 to 7 (Before Exercise 15.1)
Example 1: Find the mean deviation about the mean for the following data:
| 6 | 7 | 10 | 12 | 13 | 4 | 8 | 12 |
Answer:
The given data is: 6, 7, 10, 12, 13, 4, 8, 12.
The number of observations is $n = 8$.
First, we find the mean of the data.
Mean ($\overline{x}$) = $\frac{\text{Sum of observations}}{\text{Number of observations}}$
Sum of observations = $6 + 7 + 10 + 12 + 13 + 4 + 8 + 12 = 72$
$\overline{x} = \frac{72}{8} = 9$
The mean of the data is 9.
Next, we find the absolute deviation of each observation from the mean, i.e., $|x_i - \overline{x}|$.
$|6 - 9| = |-3| = 3$
$|7 - 9| = |-2| = 2$
$|10 - 9| = |1| = 1$
$|12 - 9| = |3| = 3$
$|13 - 9| = |4| = 4$
$|4 - 9| = |-5| = 5$
$|8 - 9| = |-1| = 1$
$|12 - 9| = |3| = 3$
Now, we find the sum of the absolute deviations.
$\sum\limits |x_i - \overline{x}| = 3 + 2 + 1 + 3 + 4 + 5 + 1 + 3 = 22$
Finally, we calculate the mean deviation about the mean.
Mean Deviation about the mean = $\frac{\sum\limits |x_i - \overline{x}|}{n}$
MD($\overline{x}$) = $\frac{22}{8} = \frac{11}{4} = 2.75$
The mean deviation about the mean for the given data is 2.75.
Example 2: Find the mean deviation about the mean for the following data:
| 12 | 3 | 18 | 17 | 4 | 9 | 17 | 19 | 20 | 15 |
| 8 | 17 | 2 | 3 | 16 | 11 | 3 | 1 | 0 | 5 |
Answer:
The given data is:
12, 3, 18, 17, 4, 9, 17, 19, 20, 15, 8, 17, 2, 3, 16, 11, 3, 1, 0, 5.
The number of observations is $n$. By counting the data points, we have $n = 20$.
First, we find the mean of the data ($\overline{x}$).
$\overline{x} = \frac{\text{Sum of observations}}{\text{Number of observations}} = \frac{\sum\limits x_i}{n}$
Sum of observations ($\sum\limits x_i$) = $12 + 3 + 18 + 17 + 4 + 9 + 17 + 19 + 20 + 15 + 8 + 17 + 2 + 3 + 16 + 11 + 3 + 1 + 0 + 5 = 200$
$\overline{x} = \frac{200}{20} = 10$
The mean of the data is 10.
Next, we find the absolute deviation of each observation from the mean, i.e., $|x_i - \overline{x}| = |x_i - 10|$.
$|12 - 10| = 2$
$|3 - 10| = 7$
$|18 - 10| = 8$
$|17 - 10| = 7$
$|4 - 10| = 6$
$|9 - 10| = 1$
$|17 - 10| = 7$
$|19 - 10| = 9$
$|20 - 10| = 10$
$|15 - 10| = 5$
$|8 - 10| = 2$
$|17 - 10| = 7$
$|2 - 10| = 8$
$|3 - 10| = 7$
$|16 - 10| = 6$
$|11 - 10| = 1$
$|3 - 10| = 7$
$|1 - 10| = 9$
$|0 - 10| = 10$
$|5 - 10| = 5$
Now, we find the sum of the absolute deviations.
$\sum\limits |x_i - \overline{x}| = 2 + 7 + 8 + 7 + 6 + 1 + 7 + 9 + 10 + 5 + 2 + 7 + 8 + 7 + 6 + 1 + 7 + 9 + 10 + 5 = 124$
Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).
MD($\overline{x}$) = $\frac{\sum\limits |x_i - \overline{x}|}{n} = \frac{124}{20}$
MD($\overline{x}$) = $6.2$
The mean deviation about the mean for the given data is 6.2.
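The three steps used in Examples 1 and 2 (find the mean, take absolute deviations, average them) can be checked with a short Python sketch. The function name `mean_deviation_about_mean` is an assumption made for this illustration.

```python
def mean_deviation_about_mean(data):
    """Return (mean, mean deviation about the mean) for raw observations."""
    n = len(data)
    mean = sum(data) / n                               # step 1: the mean
    total_abs_dev = sum(abs(x - mean) for x in data)   # step 2: sum of |x_i - mean|
    return mean, total_abs_dev / n                     # step 3: divide by n


example_1 = [6, 7, 10, 12, 13, 4, 8, 12]
example_2 = [12, 3, 18, 17, 4, 9, 17, 19, 20, 15,
             8, 17, 2, 3, 16, 11, 3, 1, 0, 5]

mean_1, md_1 = mean_deviation_about_mean(example_1)
mean_2, md_2 = mean_deviation_about_mean(example_2)
print(mean_1, md_1)
print(mean_2, md_2)
```

Run on the data above, this should reproduce the means 9 and 10 and the mean deviations 2.75 and 6.2 obtained by hand.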
Example 3: Find the mean deviation about the median for the following data:
| 3 | 9 | 5 | 3 | 12 | 10 | 18 | 4 | 7 | 19 |
| 21 |
Answer:
The given data is: 3, 9, 5, 3, 12, 10, 18, 4, 7, 19, 21.
The number of observations is $n = 11$.
First, we need to arrange the data in ascending order to find the median.
Arranged data: 3, 3, 4, 5, 7, 9, 10, 12, 18, 19, 21.
Since the number of observations ($n = 11$) is odd, the median (M) is the value of the $\left(\frac{n+1}{2}\right)^{\text{th}}$ observation.
Median (M) = $\left(\frac{11+1}{2}\right)^{\text{th}}$ observation = $6^{\text{th}}$ observation.
From the arranged data, the $6^{\text{th}}$ observation is 9.
So, the median M = 9.
Next, we find the absolute deviation of each observation from the median, i.e., $|x_i - M| = |x_i - 9|$.
$|3 - 9| = |-6| = 6$
$|3 - 9| = |-6| = 6$
$|4 - 9| = |-5| = 5$
$|5 - 9| = |-4| = 4$
$|7 - 9| = |-2| = 2$
$|9 - 9| = |0| = 0$
$|10 - 9| = |1| = 1$
$|12 - 9| = |3| = 3$
$|18 - 9| = |9| = 9$
$|19 - 9| = |10| = 10$
$|21 - 9| = |12| = 12$
Now, we find the sum of the absolute deviations.
$\sum\limits |x_i - M| = 6 + 6 + 5 + 4 + 2 + 0 + 1 + 3 + 9 + 10 + 12 = 58$
Finally, we calculate the mean deviation about the median (MD(M)).
MD(M) = $\frac{\sum\limits |x_i - M|}{n} = \frac{58}{11}$
MD(M) $\approx 5.27$
The mean deviation about the median for the given data is $\frac{58}{11}$ or approximately 5.27.
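As a check on Example 3, the sketch below computes the mean deviation about the median with exact fractions, so the result can be compared directly with $\frac{58}{11}$. It uses `statistics.median` and `fractions.Fraction` from the Python standard library; the function name `mean_deviation_about_median` is an assumption made for this illustration.

```python
from fractions import Fraction
from statistics import median


def mean_deviation_about_median(data):
    """Return (median, mean deviation about the median) as exact fractions."""
    values = [Fraction(x) for x in data]
    m = median(values)  # sorts internally; averages the two middle values when n is even
    md = sum(abs(x - m) for x in values) / len(values)
    return m, md


example_3 = [3, 9, 5, 3, 12, 10, 18, 4, 7, 19, 21]
m, md = mean_deviation_about_median(example_3)
print(m, md, round(float(md), 2))
```

Keeping the value as a fraction avoids rounding until the final step, which mirrors how the answer is reported above.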
Example 4: Find the mean deviation about the mean for the following data:
| $x_i$ | 2 | 5 | 6 | 8 | 10 | 12 |
| $f_i$ | 2 | 8 | 10 | 7 | 8 | 5 |
Answer:
The given data is a discrete frequency distribution:
| $x_i$ | $f_i$ |
| 2 | 2 |
| 5 | 8 |
| 6 | 10 |
| 8 | 7 |
| 10 | 8 |
| 12 | 5 |
To find the mean deviation about the mean, we first need to calculate the mean ($\overline{x}$).
We calculate $f_i x_i$ for each value, along with the total frequency $\sum\limits f_i$ and the sum $\sum\limits f_i x_i$.
| $x_i$ | $f_i$ | $f_i x_i$ |
| 2 | 2 | $2 \times 2 = 4$ |
| 5 | 8 | $8 \times 5 = 40$ |
| 6 | 10 | $10 \times 6 = 60$ |
| 8 | 7 | $7 \times 8 = 56$ |
| 10 | 8 | $8 \times 10 = 80$ |
| 12 | 5 | $5 \times 12 = 60$ |
| Total | $\sum\limits f_i = 40$ | $\sum\limits f_i x_i = 300$ |
The mean ($\overline{x}$) is calculated as:
$\overline{x} = \frac{\sum\limits f_i x_i}{\sum\limits f_i}$
$\overline{x} = \frac{300}{40} = \frac{30}{4} = 7.5$
The mean of the data is 7.5.
Next, we calculate the absolute deviation of each observation from the mean, $|x_i - \overline{x}| = |x_i - 7.5|$, and the product $f_i |x_i - 7.5|$.
| $x_i$ | $f_i$ | $|x_i - 7.5|$ | $f_i |x_i - 7.5|$ |
| 2 | 2 | $|2 - 7.5| = 5.5$ | $2 \times 5.5 = 11.0$ |
| 5 | 8 | $|5 - 7.5| = 2.5$ | $8 \times 2.5 = 20.0$ |
| 6 | 10 | $|6 - 7.5| = 1.5$ | $10 \times 1.5 = 15.0$ |
| 8 | 7 | $|8 - 7.5| = 0.5$ | $7 \times 0.5 = 3.5$ |
| 10 | 8 | $|10 - 7.5| = 2.5$ | $8 \times 2.5 = 20.0$ |
| 12 | 5 | $|12 - 7.5| = 4.5$ | $5 \times 4.5 = 22.5$ |
| Total | $\sum\limits f_i = 40$ | | $\sum\limits f_i |x_i - 7.5| = 92.0$ |
Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).
MD($\overline{x}$) = $\frac{\sum\limits f_i |x_i - \overline{x}|}{\sum\limits f_i}$
MD($\overline{x}$) = $\frac{92.0}{40} = \frac{92}{40} = \frac{23}{10} = 2.3$
The mean deviation about the mean for the given data is 2.3.
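For a discrete frequency distribution, each value simply counts $f_i$ times. A minimal Python sketch of the calculation in Example 4 follows; the function name `weighted_mean_deviation` is an assumption made for this illustration.

```python
def weighted_mean_deviation(values, freqs):
    """Return (mean, mean deviation about the mean) for a discrete frequency distribution."""
    total = sum(freqs)                                          # N = sum of f_i
    mean = sum(f * x for x, f in zip(values, freqs)) / total    # sum of f_i x_i over N
    md = sum(f * abs(x - mean) for x, f in zip(values, freqs)) / total
    return mean, md


x = [2, 5, 6, 8, 10, 12]
f = [2, 8, 10, 7, 8, 5]
mean, md = weighted_mean_deviation(x, f)
print(mean, md)
```

The two sums inside the function correspond to the $f_i x_i$ and $f_i |x_i - \overline{x}|$ columns of the tables above.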
Example 5: Find the mean deviation about the median for the following data:
| $x_i$ | 3 | 6 | 9 | 12 | 13 | 15 | 21 | 22 |
| $f_i$ | 3 | 4 | 5 | 2 | 4 | 5 | 4 | 3 |
Answer:
The given data is a discrete frequency distribution:
| $x_i$ | $f_i$ |
| 3 | 3 |
| 6 | 4 |
| 9 | 5 |
| 12 | 2 |
| 13 | 4 |
| 15 | 5 |
| 21 | 4 |
| 22 | 3 |
To find the mean deviation about the median, we first need to calculate the median (M).
We calculate the total frequency $N = \sum\limits f_i$ and the cumulative frequencies (c.f.).
| $x_i$ | $f_i$ | Cumulative Frequency (c.f.) |
| 3 | 3 | 3 |
| 6 | 4 | $3 + 4 = 7$ |
| 9 | 5 | $7 + 5 = 12$ |
| 12 | 2 | $12 + 2 = 14$ |
| 13 | 4 | $14 + 4 = 18$ |
| 15 | 5 | $18 + 5 = 23$ |
| 21 | 4 | $23 + 4 = 27$ |
| 22 | 3 | $27 + 3 = 30$ |
| Total | $N = \sum\limits f_i = 30$ | |
The total number of observations is $N = 30$, which is an even number.
For an even number of observations, the median is the average of the $\left(\frac{N}{2}\right)^{\text{th}}$ and $\left(\frac{N}{2} + 1\right)^{\text{th}}$ observations.
$\frac{N}{2} = \frac{30}{2} = 15^{\text{th}}$ observation.
$\frac{N}{2} + 1 = 15 + 1 = 16^{\text{th}}$ observation.
From the cumulative frequency table, the smallest cumulative frequency greater than or equal to 15 is 18, which corresponds to $x_i = 13$. So the $15^{\text{th}}$ observation is 13.
The $16^{\text{th}}$ observation is also covered by the cumulative frequency 18, so it is 13 as well.
So, the median (M) = $\frac{13 + 13}{2} = 13$.
The median of the data is 13.
Next, we calculate the absolute deviation of each observation from the median, $|x_i - M| = |x_i - 13|$, and the product $f_i |x_i - 13|$.
| $x_i$ | $f_i$ | $|x_i - 13|$ | $f_i |x_i - 13|$ |
| 3 | 3 | $|3 - 13| = 10$ | $3 \times 10 = 30$ |
| 6 | 4 | $|6 - 13| = 7$ | $4 \times 7 = 28$ |
| 9 | 5 | $|9 - 13| = 4$ | $5 \times 4 = 20$ |
| 12 | 2 | $|12 - 13| = 1$ | $2 \times 1 = 2$ |
| 13 | 4 | $|13 - 13| = 0$ | $4 \times 0 = 0$ |
| 15 | 5 | $|15 - 13| = 2$ | $5 \times 2 = 10$ |
| 21 | 4 | $|21 - 13| = 8$ | $4 \times 8 = 32$ |
| 22 | 3 | $|22 - 13| = 9$ | $3 \times 9 = 27$ |
| Total | $\sum\limits f_i = 30$ | | $\sum\limits f_i |x_i - 13| = 149$ |
Finally, we calculate the mean deviation about the median (MD(M)).
MD(M) = $\frac{\sum\limits f_i |x_i - M|}{\sum\limits f_i}$
MD(M) = $\frac{149}{30}$
MD(M) $\approx 4.97$
The mean deviation about the median for the given data is $\frac{149}{30}$ or approximately 4.97.
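The cumulative-frequency lookup used in Example 5 can be sketched in Python as follows. The helper names `kth_observation` and `discrete_median` are assumptions made for this illustration, and the sketch assumes every frequency is a positive integer.

```python
from bisect import bisect_left
from itertools import accumulate


def kth_observation(values, cum_freqs, k):
    """Value of the k-th observation (1-based) in the sorted data."""
    # The k-th observation belongs to the first value whose
    # cumulative frequency is greater than or equal to k.
    return values[bisect_left(cum_freqs, k)]


def discrete_median(values, freqs):
    """Median of a discrete frequency distribution."""
    pairs = sorted(zip(values, freqs))
    xs = [x for x, _ in pairs]
    cum_freqs = list(accumulate(f for _, f in pairs))
    n = cum_freqs[-1]
    if n % 2 == 1:
        return kth_observation(xs, cum_freqs, (n + 1) // 2)
    lower = kth_observation(xs, cum_freqs, n // 2)
    upper = kth_observation(xs, cum_freqs, n // 2 + 1)
    return (lower + upper) / 2


def mean_deviation_about_median(values, freqs):
    m = discrete_median(values, freqs)
    n = sum(freqs)
    return m, sum(f * abs(x - m) for x, f in zip(values, freqs)) / n


x = [3, 6, 9, 12, 13, 15, 21, 22]
f = [3, 4, 5, 2, 4, 5, 4, 3]
m, md = mean_deviation_about_median(x, f)
print(m, round(md, 2))
```

The binary search in `kth_observation` performs the same step as reading down the c.f. column until the running total first reaches the required position.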
Example 6: Find the mean deviation about the mean for the following data:
| Marks obtained | 10-20 | 20-30 | 30-40 | 40-50 | 50-60 | 60-70 | 70-80 |
| Number of students | 2 | 3 | 8 | 14 | 8 | 3 | 2 |
Answer:
The given data is a grouped frequency distribution:
| Marks obtained (Class Interval) | Number of students ($f_i$) |
| 10-20 | 2 |
| 20-30 | 3 |
| 30-40 | 8 |
| 40-50 | 14 |
| 50-60 | 8 |
| 60-70 | 3 |
| 70-80 | 2 |
To find the mean deviation about the mean, we first need to calculate the mean ($\overline{x}$).
For grouped data, the mean is calculated using the midpoints of the class intervals.
Let $x_i$ be the midpoint of the $i$-th class interval and $f_i$ be the corresponding frequency.
Calculate the midpoints ($x_i$) and the product $f_i x_i$:
| Class Interval | Frequency ($f_i$) | Midpoint ($x_i$) | $f_i x_i$ |
| 10-20 | 2 | $\frac{10+20}{2} = 15$ | $2 \times 15 = 30$ |
| 20-30 | 3 | $\frac{20+30}{2} = 25$ | $3 \times 25 = 75$ |
| 30-40 | 8 | $\frac{30+40}{2} = 35$ | $8 \times 35 = 280$ |
| 40-50 | 14 | $\frac{40+50}{2} = 45$ | $14 \times 45 = 630$ |
| 50-60 | 8 | $\frac{50+60}{2} = 55$ | $8 \times 55 = 440$ |
| 60-70 | 3 | $\frac{60+70}{2} = 65$ | $3 \times 65 = 195$ |
| 70-80 | 2 | $\frac{70+80}{2} = 75$ | $2 \times 75 = 150$ |
| Total | $\sum\limits f_i = 40$ | | $\sum\limits f_i x_i = 1800$ |
The mean ($\overline{x}$) is given by the formula:
$\overline{x} = \frac{\sum\limits f_i x_i}{\sum\limits f_i}$
$\overline{x} = \frac{1800}{40} = 45$
The mean of the data is 45.
Next, we calculate the absolute deviation of each midpoint from the mean, $|x_i - \overline{x}| = |x_i - 45|$, and the product $f_i |x_i - 45|$.
| Class Interval | $x_i$ | $f_i$ | $|x_i - 45|$ | $f_i |x_i - 45|$ |
| 10-20 | 15 | 2 | $|15 - 45| = |-30| = 30$ | $2 \times 30 = 60$ |
| 20-30 | 25 | 3 | $|25 - 45| = |-20| = 20$ | $3 \times 20 = 60$ |
| 30-40 | 35 | 8 | $|35 - 45| = |-10| = 10$ | $8 \times 10 = 80$ |
| 40-50 | 45 | 14 | $|45 - 45| = |0| = 0$ | $14 \times 0 = 0$ |
| 50-60 | 55 | 8 | $|55 - 45| = |10| = 10$ | $8 \times 10 = 80$ |
| 60-70 | 65 | 3 | $|65 - 45| = |20| = 20$ | $3 \times 20 = 60$ |
| 70-80 | 75 | 2 | $|75 - 45| = |30| = 30$ | $2 \times 30 = 60$ |
| Total | | $\sum\limits f_i = 40$ | | $\sum\limits f_i |x_i - 45| = 400$ |
Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).
MD($\overline{x}$) = $\frac{\sum\limits f_i |x_i - \overline{x}|}{\sum\limits f_i}$
MD($\overline{x}$) = $\frac{400}{40} = 10$
The mean deviation about the mean for the given data is 10.
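For grouped data the only new step is replacing each class by its midpoint, which treats all observations in a class as if they sat at the centre of that class. A minimal Python sketch under that assumption follows; the function name `grouped_mean_deviation` is hypothetical.

```python
def grouped_mean_deviation(classes, freqs):
    """Return (mean, mean deviation about the mean) for a grouped distribution.

    classes is a list of (lower, upper) class limits; each class is
    represented by its midpoint.
    """
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    total = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / total
    md = sum(f * abs(x - mean) for x, f in zip(midpoints, freqs)) / total
    return mean, md


classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]
freqs = [2, 3, 8, 14, 8, 3, 2]
mean, md = grouped_mean_deviation(classes, freqs)
print(mean, md)
```

Because the frequencies in Example 6 are symmetric about the 40-50 class, the mean coincides with that class's midpoint, 45.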
Example 7: Calculate the mean deviation about the median for the following data:
| Class | 0-10 | 10-20 | 20-30 | 30-40 | 40-50 | 50-60 |
| Frequency | 6 | 7 | 15 | 16 | 4 | 2 |
Answer:
The given data is a grouped frequency distribution:
| Class Interval | Frequency ($f_i$) |
| 0-10 | 6 |
| 10-20 | 7 |
| 20-30 | 15 |
| 30-40 | 16 |
| 40-50 | 4 |
| 50-60 | 2 |
To find the mean deviation about the median, we first need to calculate the median (M).
We calculate the cumulative frequencies (c.f.) and the total frequency $N = \sum\limits f_i$.
| Class Interval | Frequency ($f_i$) | Cumulative Frequency (c.f.) |
| 0-10 | 6 | 6 |
| 10-20 | 7 | $6 + 7 = 13$ |
| 20-30 | 15 | $13 + 15 = 28$ |
| 30-40 | 16 | $28 + 16 = 44$ |
| 40-50 | 4 | $44 + 4 = 48$ |
| 50-60 | 2 | $48 + 2 = 50$ |
| Total | $N = \sum\limits f_i = 50$ | |
The total number of observations is $N = 50$. We need to find the class containing the $\left(\frac{N}{2}\right)^{\text{th}}$ observation.
$\frac{N}{2} = \frac{50}{2} = 25^{\text{th}}$ observation.
The cumulative frequency just greater than or equal to 25 is 28, which corresponds to the class interval 20-30.
So, the median class is 20-30.
For the median class (20-30):
Lower boundary (L) = 20
Frequency of the median class (f) = 15
Cumulative frequency of the class preceding the median class (c.f.) = 13 (c.f. of 10-20 class)
Class size (h) = $30 - 20 = 10$
The median (M) is calculated using the formula:
$M = L + \frac{\frac{N}{2} - c.f.}{f} \times h$
$M = 20 + \frac{25 - 13}{15} \times 10$
$M = 20 + \frac{12}{15} \times 10$
$M = 20 + \frac{4}{5} \times 10$
$M = 20 + 4 \times 2$
$M = 20 + 8$
$M = 28$
The median of the data is 28.
Next, we calculate the midpoints ($x_i$) of each class interval, the absolute deviation from the median $|x_i - 28|$, and the product $f_i |x_i - 28|$.
| Class Interval | $f_i$ | Midpoint ($x_i$) | $|x_i - 28|$ | $f_i |x_i - 28|$ |
| 0-10 | 6 | $\frac{0+10}{2} = 5$ | $|5 - 28| = |-23| = 23$ | $6 \times 23 = 138$ |
| 10-20 | 7 | $\frac{10+20}{2} = 15$ | $|15 - 28| = |-13| = 13$ | $7 \times 13 = 91$ |
| 20-30 | 15 | $\frac{20+30}{2} = 25$ | $|25 - 28| = |-3| = 3$ | $15 \times 3 = 45$ |
| 30-40 | 16 | $\frac{30+40}{2} = 35$ | $|35 - 28| = |7| = 7$ | $16 \times 7 = 112$ |
| 40-50 | 4 | $\frac{40+50}{2} = 45$ | $|45 - 28| = |17| = 17$ | $4 \times 17 = 68$ |
| 50-60 | 2 | $\frac{50+60}{2} = 55$ | $|55 - 28| = |27| = 27$ | $2 \times 27 = 54$ |
| Total | $\sum\limits f_i = 50$ | | | $\sum\limits f_i |x_i - 28| = 508$ |
Finally, we calculate the mean deviation about the median (MD(M)).
MD(M) = $\frac{\sum\limits f_i |x_i - M|}{\sum\limits f_i}$
MD(M) = $\frac{508}{50} = 10.16$
The mean deviation about the median for the given data is 10.16.
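The median-class search and the interpolation formula $M = L + \frac{\frac{N}{2} - c.f.}{f} \times h$ from Example 7 can be sketched in Python as below. The function names are assumptions made for this illustration, and the classes are assumed to be continuous and listed in ascending order.

```python
def grouped_median(classes, freqs):
    """Median of a continuous frequency distribution by interpolation."""
    n = sum(freqs)
    cum_before = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(classes, freqs):
        if cum_before + f >= n / 2:
            # M = L + ((N/2 - c.f.) / f) * h, with h = upper - lower
            return lower + (n / 2 - cum_before) / f * (upper - lower)
        cum_before += f
    raise ValueError("no median class found")


def grouped_md_about_median(classes, freqs):
    m = grouped_median(classes, freqs)
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    n = sum(freqs)
    return m, sum(f * abs(x - m) for x, f in zip(midpoints, freqs)) / n


classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [6, 7, 15, 16, 4, 2]
m, md = grouped_md_about_median(classes, freqs)
print(m, md)
```

The loop stops at the first class whose cumulative frequency reaches $\frac{N}{2}$, which is the median class 20-30 here.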
Exercise 15.1
Find the mean deviation about the mean for the data in Exercises 1 and 2.
Question 1.
| 4 | 7 | 8 | 9 | 10 | 12 | 13 | 17 |
Answer:
The given data is: 4, 7, 8, 9, 10, 12, 13, 17.
The number of observations is $n = 8$.
First, we find the mean of the data.
Mean ($\overline{x}$) = $\frac{\text{Sum of observations}}{\text{Number of observations}}$
Sum of observations = $4 + 7 + 8 + 9 + 10 + 12 + 13 + 17 = 80$
$\overline{x} = \frac{80}{8} = 10$
The mean of the data is 10.
Next, we find the absolute deviation of each observation from the mean, i.e., $|x_i - \overline{x}| = |x_i - 10|$.
$|4 - 10| = |-6| = 6$
$|7 - 10| = |-3| = 3$
$|8 - 10| = |-2| = 2$
$|9 - 10| = |-1| = 1$
$|10 - 10| = |0| = 0$
$|12 - 10| = |2| = 2$
$|13 - 10| = |3| = 3$
$|17 - 10| = |7| = 7$
Now, we find the sum of the absolute deviations.
$\sum\limits_{i=1}^{8} |x_i - \overline{x}| = 6 + 3 + 2 + 1 + 0 + 2 + 3 + 7 = 24$
Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).
MD($\overline{x}$) = $\frac{\sum\limits_{i=1}^{n} |x_i - \overline{x}|}{n}$
MD($\overline{x}$) = $\frac{24}{8} = 3$
The mean deviation about the mean for the given data is 3.
Question 2.
| 38 | 70 | 48 | 40 | 42 | 55 | 63 | 46 | 54 | 44 |
Answer:
The given data is: 38, 70, 48, 40, 42, 55, 63, 46, 54, 44.
The number of observations is $n$. By counting the data points, we have $n = 10$.
First, we find the mean of the data ($\overline{x}$).
$\overline{x} = \frac{\text{Sum of observations}}{\text{Number of observations}} = \frac{\sum\limits x_i}{n}$
Sum of observations ($\sum\limits x_i$) = $38 + 70 + 48 + 40 + 42 + 55 + 63 + 46 + 54 + 44 = 500$
$\overline{x} = \frac{500}{10} = 50$
The mean of the data is 50.
Next, we find the absolute deviation of each observation from the mean, i.e., $|x_i - \overline{x}| = |x_i - 50|$.
$|38 - 50| = |-12| = 12$
$|70 - 50| = |20| = 20$
$|48 - 50| = |-2| = 2$
$|40 - 50| = |-10| = 10$
$|42 - 50| = |-8| = 8$
$|55 - 50| = |5| = 5$
$|63 - 50| = |13| = 13$
$|46 - 50| = |-4| = 4$
$|54 - 50| = |4| = 4$
$|44 - 50| = |-6| = 6$
Now, we find the sum of the absolute deviations.
$\sum\limits |x_i - \overline{x}| = 12 + 20 + 2 + 10 + 8 + 5 + 13 + 4 + 4 + 6 = 84$
Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).
MD($\overline{x}$) = $\frac{\sum\limits |x_i - \overline{x}|}{n} = \frac{84}{10}$
MD($\overline{x}$) = $8.4$
The mean deviation about the mean for the given data is 8.4.
Find the mean deviation about the median for the data in Exercises 3 and 4.
Question 3.
| 13 | 17 | 16 | 14 | 11 | 13 | 10 | 16 | 11 | 18 |
| 12 | 17 |
Answer:
The given data is: 13, 17, 16, 14, 11, 13, 10, 16, 11, 18, 12, 17.
The number of observations is $n$. By counting the data points, we have $n = 12$.
To find the mean deviation about the median, we first need to calculate the median (M).
We arrange the data in ascending order:
10, 11, 11, 12, 13, 13, 14, 16, 16, 17, 17, 18.
Since the number of observations ($n = 12$) is even, the median is the average of the $\left(\frac{n}{2}\right)^{\text{th}}$ and $\left(\frac{n}{2} + 1\right)^{\text{th}}$ observations.
$\frac{n}{2} = \frac{12}{2} = 6^{\text{th}}$ observation.
$\frac{n}{2} + 1 = 6 + 1 = 7^{\text{th}}$ observation.
The $6^{\text{th}}$ observation in the arranged data is 13.
The $7^{\text{th}}$ observation in the arranged data is 14.
Median (M) = $\frac{6^{\text{th}} \text{ observation} + 7^{\text{th}} \text{ observation}}{2} = \frac{13 + 14}{2} = \frac{27}{2} = 13.5$
The median of the data is 13.5.
Next, we find the absolute deviation of each observation from the median, i.e., $|x_i - M| = |x_i - 13.5|$.
$|10 - 13.5| = 3.5$
$|11 - 13.5| = 2.5$
$|11 - 13.5| = 2.5$
$|12 - 13.5| = 1.5$
$|13 - 13.5| = 0.5$
$|13 - 13.5| = 0.5$
$|14 - 13.5| = 0.5$
$|16 - 13.5| = 2.5$
$|16 - 13.5| = 2.5$
$|17 - 13.5| = 3.5$
$|17 - 13.5| = 3.5$
$|18 - 13.5| = 4.5$
Now, we find the sum of the absolute deviations.
$\sum\limits_{i=1}^{12} |x_i - M| = 3.5 + 2.5 + 2.5 + 1.5 + 0.5 + 0.5 + 0.5 + 2.5 + 2.5 + 3.5 + 3.5 + 4.5 = 28$
Finally, we calculate the mean deviation about the median (MD(M)).
MD(M) = $\frac{\sum\limits_{i=1}^{n} |x_i - M|}{n}$
MD(M) = $\frac{28}{12} = \frac{7}{3}$
MD(M) $\approx 2.33$
The mean deviation about the median for the given data is $\frac{7}{3}$ or approximately 2.33.
Question 4.
| 36 | 72 | 46 | 42 | 60 | 45 | 53 | 46 | 51 | 49 |
Answer:
The given data is: 36, 72, 46, 42, 60, 45, 53, 46, 51, 49.
The number of observations is $n$. By counting the data points, we have $n = 10$.
To find the mean deviation about the median, we first need to calculate the median (M).
We arrange the data in ascending order:
36, 42, 45, 46, 46, 49, 51, 53, 60, 72.
Since the number of observations ($n = 10$) is even, the median is the average of the $\left(\frac{n}{2}\right)^{\text{th}}$ and $\left(\frac{n}{2} + 1\right)^{\text{th}}$ observations.
$\frac{n}{2} = \frac{10}{2} = 5^{\text{th}}$ observation.
$\frac{n}{2} + 1 = 5 + 1 = 6^{\text{th}}$ observation.
The $5^{\text{th}}$ observation in the arranged data is 46.
The $6^{\text{th}}$ observation in the arranged data is 49.
Median (M) = $\frac{5^{\text{th}} \text{ observation} + 6^{\text{th}} \text{ observation}}{2} = \frac{46 + 49}{2} = \frac{95}{2} = 47.5$
The median of the data is 47.5.
Next, we find the absolute deviation of each observation from the median, i.e., $|x_i - M| = |x_i - 47.5|$.
$|36 - 47.5| = |-11.5| = 11.5$
$|42 - 47.5| = |-5.5| = 5.5$
$|45 - 47.5| = |-2.5| = 2.5$
$|46 - 47.5| = |-1.5| = 1.5$
$|46 - 47.5| = |-1.5| = 1.5$
$|49 - 47.5| = |1.5| = 1.5$
$|51 - 47.5| = |3.5| = 3.5$
$|53 - 47.5| = |5.5| = 5.5$
$|60 - 47.5| = |12.5| = 12.5$
$|72 - 47.5| = |24.5| = 24.5$
Now, we find the sum of the absolute deviations.
$\sum\limits_{i=1}^{10} |x_i - M| = 11.5 + 5.5 + 2.5 + 1.5 + 1.5 + 1.5 + 3.5 + 5.5 + 12.5 + 24.5 = 70.0$
Finally, we calculate the mean deviation about the median (MD(M)).
MD(M) = $\frac{\sum\limits_{i=1}^{n} |x_i - M|}{n}$
MD(M) = $\frac{70.0}{10} = 7.0$
The mean deviation about the median for the given data is 7.
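Questions 3 and 4 both have an even number of observations, so the median is the average of the two middle values. The positional rule can be written out explicitly in Python; the function name `median_by_position` is an assumption made for this illustration.

```python
def median_by_position(data):
    """Median using the (n+1)/2-th rule (odd n) or the n/2-th and (n/2 + 1)-th rule (even n)."""
    ordered = sorted(data)
    n = len(ordered)
    if n % 2 == 1:
        return ordered[(n + 1) // 2 - 1]   # positions are 1-based, list indices are 0-based
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


def md_about_median(data):
    m = median_by_position(data)
    return m, sum(abs(x - m) for x in data) / len(data)


question_3 = [13, 17, 16, 14, 11, 13, 10, 16, 11, 18, 12, 17]
question_4 = [36, 72, 46, 42, 60, 45, 53, 46, 51, 49]

m3, md3 = md_about_median(question_3)
m4, md4 = md_about_median(question_4)
print(m3, round(md3, 2))
print(m4, md4)
```

The `- 1` offsets convert the textbook's 1-based observation positions to Python's 0-based list indices.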
Find the mean deviation about the mean for the data in Exercises 5 and 6.
Question 5.
| $x_i$ | 5 | 10 | 15 | 20 | 25 |
| $f_i$ | 7 | 4 | 6 | 3 | 5 |
Answer:
The given data is a discrete frequency distribution:
| $x_i$ | $f_i$ |
| 5 | 7 |
| 10 | 4 |
| 15 | 6 |
| 20 | 3 |
| 25 | 5 |
To find the mean deviation about the mean, we first need to calculate the mean ($\overline{x}$).
We calculate $f_i x_i$ for each value and the total frequency $\sum\limits f_i$ and the sum $\sum\limits f_i x_i$.
| $x_i$ | $f_i$ | $f_i x_i$ |
| 5 | 7 | $7 \times 5 = 35$ |
| 10 | 4 | $4 \times 10 = 40$ |
| 15 | 6 | $6 \times 15 = 90$ |
| 20 | 3 | $3 \times 20 = 60$ |
| 25 | 5 | $5 \times 25 = 125$ |
| Total | $\sum\limits f_i = 25$ | $\sum\limits f_i x_i = 350$ |
The mean ($\overline{x}$) is calculated as:
$\overline{x} = \frac{\sum\limits f_i x_i}{\sum\limits f_i}$
$\overline{x} = \frac{350}{25} = 14$
The mean of the data is 14.
Next, we calculate the absolute deviation of each observation from the mean, $|x_i - \overline{x}| = |x_i - 14|$, and the product $f_i |x_i - 14|$.
| $x_i$ | $f_i$ | $|x_i - 14|$ | $f_i |x_i - 14|$ |
| 5 | 7 | $|5 - 14| = 9$ | $7 \times 9 = 63$ |
| 10 | 4 | $|10 - 14| = 4$ | $4 \times 4 = 16$ |
| 15 | 6 | $|15 - 14| = 1$ | $6 \times 1 = 6$ |
| 20 | 3 | $|20 - 14| = 6$ | $3 \times 6 = 18$ |
| 25 | 5 | $|25 - 14| = 11$ | $5 \times 11 = 55$ |
| Total | $\sum\limits f_i = 25$ | | $\sum\limits f_i |x_i - 14| = 158$ |
Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).
MD($\overline{x}$) = $\frac{\sum\limits f_i |x_i - \overline{x}|}{\sum\limits f_i}$
MD($\overline{x}$) = $\frac{158}{25}$
MD($\overline{x}$) = $6.32$
The mean deviation about the mean for the given data is 6.32.
Question 6.
| $x_i$ | 10 | 30 | 50 | 70 | 90 |
| $f_i$ | 4 | 24 | 28 | 16 | 8 |
Answer:
The given data is a discrete frequency distribution:
| $x_i$ | $f_i$ |
| 10 | 4 |
| 30 | 24 |
| 50 | 28 |
| 70 | 16 |
| 90 | 8 |
To find the mean deviation about the mean, we first need to calculate the mean ($\overline{x}$).
We calculate $f_i x_i$ for each value and the total frequency $\sum\limits f_i$ and the sum $\sum\limits f_i x_i$.
| $x_i$ | $f_i$ | $f_i x_i$ |
| 10 | 4 | $4 \times 10 = 40$ |
| 30 | 24 | $24 \times 30 = 720$ |
| 50 | 28 | $28 \times 50 = 1400$ |
| 70 | 16 | $16 \times 70 = 1120$ |
| 90 | 8 | $8 \times 90 = 720$ |
| Total | $\sum\limits f_i = 80$ | $\sum\limits f_i x_i = 4000$ |
The mean ($\overline{x}$) is calculated as:
$\overline{x} = \frac{\sum\limits f_i x_i}{\sum\limits f_i}$
$\overline{x} = \frac{4000}{80} = 50$
The mean of the data is 50.
Next, we calculate the absolute deviation of each observation from the mean, $|x_i - \overline{x}| = |x_i - 50|$, and the product $f_i |x_i - 50|$.
| $x_i$ | $f_i$ | $|x_i - 50|$ | $f_i |x_i - 50|$ |
| 10 | 4 | $|10 - 50| = 40$ | $4 \times 40 = 160$ |
| 30 | 24 | $|30 - 50| = 20$ | $24 \times 20 = 480$ |
| 50 | 28 | $|50 - 50| = 0$ | $28 \times 0 = 0$ |
| 70 | 16 | $|70 - 50| = 20$ | $16 \times 20 = 320$ |
| 90 | 8 | $|90 - 50| = 40$ | $8 \times 40 = 320$ |
| Total | $\sum\limits f_i = 80$ | | $\sum\limits f_i |x_i - 50| = 1280$ |
Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).
MD($\overline{x}$) = $\frac{\sum\limits f_i |x_i - \overline{x}|}{\sum\limits f_i}$
MD($\overline{x}$) = $\frac{1280}{80} = 16$
The mean deviation about the mean for the given data is 16.
Find the mean deviation about the median for the data in Exercises 7 and 8.
Question 7.
| $x_i$ | 5 | 7 | 9 | 10 | 12 | 15 |
| $f_i$ | 8 | 6 | 2 | 2 | 2 | 6 |
Answer:
The given data is a discrete frequency distribution:
| $x_i$ | $f_i$ |
| 5 | 8 |
| 7 | 6 |
| 9 | 2 |
| 10 | 2 |
| 12 | 2 |
| 15 | 6 |
To find the mean deviation about the median, we first need to calculate the median (M).
We calculate the total frequency $N = \sum\limits f_i$ and the cumulative frequencies (c.f.).
| $x_i$ | $f_i$ | Cumulative Frequency (c.f.) |
| 5 | 8 | 8 |
| 7 | 6 | $8 + 6 = 14$ |
| 9 | 2 | $14 + 2 = 16$ |
| 10 | 2 | $16 + 2 = 18$ |
| 12 | 2 | $18 + 2 = 20$ |
| 15 | 6 | $20 + 6 = 26$ |
| Total | $N = \sum\limits f_i = 26$ | |
The total number of observations is $N = 26$, which is an even number.
For an even number of observations in a discrete frequency distribution, the median is the average of the values of the $\left(\frac{N}{2}\right)^{\text{th}}$ and $\left(\frac{N}{2} + 1\right)^{\text{th}}$ observations.
$\frac{N}{2} = \frac{26}{2} = 13^{\text{th}}$ observation.
$\frac{N}{2} + 1 = 13 + 1 = 14^{\text{th}}$ observation.
From the cumulative frequency table, the smallest cumulative frequency greater than or equal to 13 is 14, which corresponds to $x_i = 7$. So the $13^{\text{th}}$ observation is 7.
The $14^{\text{th}}$ observation is also covered by the cumulative frequency 14, so it is 7 as well.
So, the median (M) = $\frac{7 + 7}{2} = 7$.
The median of the data is 7.
Next, we calculate the absolute deviation of each observation from the median, $|x_i - M| = |x_i - 7|$, and the product $f_i |x_i - 7|$.
| $x_i$ | $f_i$ | $|x_i - 7|$ | $f_i |x_i - 7|$ |
| 5 | 8 | $|5 - 7| = 2$ | $8 \times 2 = 16$ |
| 7 | 6 | $|7 - 7| = 0$ | $6 \times 0 = 0$ |
| 9 | 2 | $|9 - 7| = 2$ | $2 \times 2 = 4$ |
| 10 | 2 | $|10 - 7| = 3$ | $2 \times 3 = 6$ |
| 12 | 2 | $|12 - 7| = 5$ | $2 \times 5 = 10$ |
| 15 | 6 | $|15 - 7| = 8$ | $6 \times 8 = 48$ |
| Total | $\sum\limits f_i = 26$ | | $\sum\limits f_i |x_i - 7| = 84$ |
Finally, we calculate the mean deviation about the median (MD(M)).
MD(M) = $\frac{\sum\limits f_i |x_i - M|}{\sum\limits f_i}$
MD(M) = $\frac{84}{26} = \frac{42}{13}$
MD(M) $\approx 3.23$
The mean deviation about the median for the given data is $\frac{42}{13}$ or approximately 3.23.
Question 8.
| $x_i$ | 15 | 21 | 27 | 30 | 35 |
| $f_i$ | 3 | 5 | 6 | 7 | 8 |
Answer:
The given data is a discrete frequency distribution:
| $x_i$ | $f_i$ |
| 15 | 3 |
| 21 | 5 |
| 27 | 6 |
| 30 | 7 |
| 35 | 8 |
To find the mean deviation about the median, we first need to calculate the median (M).
We calculate the total frequency $N = \sum\limits f_i$ and the cumulative frequencies (c.f.).
| $x_i$ | $f_i$ | Cumulative Frequency (c.f.) |
| 15 | 3 | 3 |
| 21 | 5 | $3 + 5 = 8$ |
| 27 | 6 | $8 + 6 = 14$ |
| 30 | 7 | $14 + 7 = 21$ |
| 35 | 8 | $21 + 8 = 29$ |
| Total | $N = \sum\limits f_i = 29$ | |
The total number of observations is $N = 29$, which is an odd number.
For an odd number of observations in a discrete frequency distribution, the median is the value of the $\left(\frac{N+1}{2}\right)^{\text{th}}$ observation.
$\frac{N+1}{2} = \frac{29+1}{2} = \frac{30}{2} = 15^{\text{th}}$ observation.
From the cumulative frequency table, the smallest cumulative frequency greater than or equal to 15 is 21, which corresponds to $x_i = 30$. So the $15^{\text{th}}$ observation is 30.
So, the median (M) = 30.
The median of the data is 30.
Next, we calculate the absolute deviation of each observation from the median, $|x_i - M| = |x_i - 30|$, and the product $f_i |x_i - 30|$.
| $x_i$ | $f_i$ | $|x_i - 30|$ | $f_i |x_i - 30|$ |
| 15 | 3 | $|15 - 30| = 15$ | $3 \times 15 = 45$ |
| 21 | 5 | $|21 - 30| = 9$ | $5 \times 9 = 45$ |
| 27 | 6 | $|27 - 30| = 3$ | $6 \times 3 = 18$ |
| 30 | 7 | $|30 - 30| = 0$ | $7 \times 0 = 0$ |
| 35 | 8 | $|35 - 30| = 5$ | $8 \times 5 = 40$ |
| Total | $\sum\limits f_i = 29$ | | $\sum\limits f_i |x_i - 30| = 148$ |
Finally, we calculate the mean deviation about the median (MD(M)).
MD(M) = $\frac{\sum\limits f_i |x_i - M|}{\sum\limits f_i}$
MD(M) = $\frac{148}{29}$
MD(M) $\approx 5.103$
The mean deviation about the median for the given data is $\frac{148}{29}$ or approximately 5.103.
Find the mean deviation about the mean for the data in Exercises 9 and 10.
Question 9.
| Income per day in ₹ | 0-100 | 100-200 | 200-300 | 300-400 | 400-500 | 500-600 | 600-700 | 700-800 |
| Number of persons | 4 | 8 | 9 | 10 | 7 | 5 | 4 | 3 |
Answer:
The given data is a grouped frequency distribution:
| Income per day in ₹ (Class Interval) | Number of persons ($f_i$) |
| 0-100 | 4 |
| 100-200 | 8 |
| 200-300 | 9 |
| 300-400 | 10 |
| 400-500 | 7 |
| 500-600 | 5 |
| 600-700 | 4 |
| 700-800 | 3 |
To find the mean deviation about the mean, we first need to calculate the mean ($\overline{x}$).
For grouped data, the mean is calculated using the midpoints of the class intervals.
Let $x_i$ be the midpoint of the $i$-th class interval and $f_i$ be the corresponding frequency.
Calculate the midpoints ($x_i$) and the product $f_i x_i$:
| Class Interval | Frequency ($f_i$) | Midpoint ($x_i$) | $f_i x_i$ |
| 0-100 | 4 | $\frac{0+100}{2} = 50$ | $4 \times 50 = 200$ |
| 100-200 | 8 | $\frac{100+200}{2} = 150$ | $8 \times 150 = 1200$ |
| 200-300 | 9 | $\frac{200+300}{2} = 250$ | $9 \times 250 = 2250$ |
| 300-400 | 10 | $\frac{300+400}{2} = 350$ | $10 \times 350 = 3500$ |
| 400-500 | 7 | $\frac{400+500}{2} = 450$ | $7 \times 450 = 3150$ |
| 500-600 | 5 | $\frac{500+600}{2} = 550$ | $5 \times 550 = 2750$ |
| 600-700 | 4 | $\frac{600+700}{2} = 650$ | $4 \times 650 = 2600$ |
| 700-800 | 3 | $\frac{700+800}{2} = 750$ | $3 \times 750 = 2250$ |
| Total | $\sum\limits f_i = 50$ | $\sum\limits f_i x_i = 17900$ |
The mean ($\overline{x}$) is given by the formula:
$\overline{x} = \frac{\sum\limits f_i x_i}{\sum\limits f_i}$
$\overline{x} = \frac{17900}{50} = \frac{1790}{5} = 358$
The mean income per day is $\textsf{₹}$ 358.
Next, we calculate the absolute deviation of each midpoint from the mean, $|x_i - \overline{x}| = |x_i - 358|$, and the product $f_i |x_i - 358|$.
| Class Interval | $x_i$ | $f_i$ | $|x_i - 358|$ | $f_i |x_i - 358|$ |
| 0-100 | 50 | 4 | $|50 - 358| = |-308| = 308$ | $4 \times 308 = 1232$ |
| 100-200 | 150 | 8 | $|150 - 358| = |-208| = 208$ | $8 \times 208 = 1664$ |
| 200-300 | 250 | 9 | $|250 - 358| = |-108| = 108$ | $9 \times 108 = 972$ |
| 300-400 | 350 | 10 | $|350 - 358| = |-8| = 8$ | $10 \times 8 = 80$ |
| 400-500 | 450 | 7 | $|450 - 358| = |92| = 92$ | $7 \times 92 = 644$ |
| 500-600 | 550 | 5 | $|550 - 358| = |192| = 192$ | $5 \times 192 = 960$ |
| 600-700 | 650 | 4 | $|650 - 358| = |292| = 292$ | $4 \times 292 = 1168$ |
| 700-800 | 750 | 3 | $|750 - 358| = |392| = 392$ | $3 \times 392 = 1176$ |
| Total | $\sum\limits f_i = 50$ | $\sum\limits f_i |x_i - 358| = 7896$ |
Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).
MD($\overline{x}$) = $\frac{\sum\limits f_i |x_i - \overline{x}|}{\sum\limits f_i}$
MD($\overline{x}$) = $\frac{7896}{50} = \frac{3948}{25} = 157.92$
The mean deviation about the mean for the given data is $\textsf{₹}$ 157.92.
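As a rough check of the grouped-data procedure (midpoints, mean, then weighted absolute deviations), here is a minimal Python sketch. The helper name `grouped_mean_deviation` is an assumption made for illustration.

```python
# Class limits and frequencies from Question 9.
classes = [(0, 100), (100, 200), (200, 300), (300, 400),
           (400, 500), (500, 600), (600, 700), (700, 800)]
freqs = [4, 8, 9, 10, 7, 5, 4, 3]

def grouped_mean_deviation(classes, freqs):
    """Return (mean, mean deviation about the mean) using class midpoints."""
    mids = [(lo + hi) / 2 for lo, hi in classes]
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, mids)) / n
    md = sum(f * abs(x - mean) for f, x in zip(freqs, mids)) / n
    return mean, md

mean, md = grouped_mean_deviation(classes, freqs)
print(mean, md)   # 358.0 157.92
```

The midpoint stands in for every observation in its class, which is the same approximation the hand calculation makes.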
Question 10.
| Height in cms | 95-105 | 105-115 | 115-125 | 125-135 | 135-145 | 145-155 |
| Number of boys | 9 | 13 | 26 | 30 | 12 | 10 |
Answer:
The given data is a grouped frequency distribution of height and number of boys:
| Height in cms (Class Interval) | Number of boys ($f_i$) |
| 95-105 | 9 |
| 105-115 | 13 |
| 115-125 | 26 |
| 125-135 | 30 |
| 135-145 | 12 |
| 145-155 | 10 |
To find the mean deviation about the mean, we first need to calculate the mean ($\overline{x}$).
Let $x_i$ be the midpoint of the $i$-th class interval and $f_i$ be the corresponding frequency. We calculate the midpoints ($x_i$) and the products $f_i x_i$:
| Class Interval | Frequency ($f_i$) | Midpoint ($x_i$) | $f_i x_i$ |
| 95-105 | 9 | $\frac{95+105}{2} = 100$ | $9 \times 100 = 900$ |
| 105-115 | 13 | $\frac{105+115}{2} = 110$ | $13 \times 110 = 1430$ |
| 115-125 | 26 | $\frac{115+125}{2} = 120$ | $26 \times 120 = 3120$ |
| 125-135 | 30 | $\frac{125+135}{2} = 130$ | $30 \times 130 = 3900$ |
| 135-145 | 12 | $\frac{135+145}{2} = 140$ | $12 \times 140 = 1680$ |
| 145-155 | 10 | $\frac{145+155}{2} = 150$ | $10 \times 150 = 1500$ |
| Total | $\sum\limits f_i = 100$ | $\sum\limits f_i x_i = 12530$ |
The mean ($\overline{x}$) is given by the formula:
$\overline{x} = \frac{\sum\limits f_i x_i}{\sum\limits f_i}$
$\overline{x} = \frac{12530}{100} = 125.3$
The mean height is 125.3 cm.
Next, we calculate the absolute deviation of each midpoint from the mean, $|x_i - \overline{x}| = |x_i - 125.3|$, and the product $f_i |x_i - 125.3|$.
| Class Interval | $x_i$ | $f_i$ | $|x_i - 125.3|$ | $f_i |x_i - 125.3|$ |
| 95-105 | 100 | 9 | $|100 - 125.3| = 25.3$ | $9 \times 25.3 = 227.7$ |
| 105-115 | 110 | 13 | $|110 - 125.3| = 15.3$ | $13 \times 15.3 = 198.9$ |
| 115-125 | 120 | 26 | $|120 - 125.3| = 5.3$ | $26 \times 5.3 = 137.8$ |
| 125-135 | 130 | 30 | $|130 - 125.3| = 4.7$ | $30 \times 4.7 = 141$ |
| 135-145 | 140 | 12 | $|140 - 125.3| = 14.7$ | $12 \times 14.7 = 176.4$ |
| 145-155 | 150 | 10 | $|150 - 125.3| = 24.7$ | $10 \times 24.7 = 247$ |
| Total | $\sum\limits f_i = 100$ | $\sum\limits f_i |x_i - 125.3| = 1128.8$ |
Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).
MD($\overline{x}$) = $\frac{\sum\limits f_i |x_i - \overline{x}|}{\sum\limits f_i}$
MD($\overline{x}$) = $\frac{1128.8}{100} = 11.288$
The mean deviation about the mean for the given data is 11.288 cm, or approximately 11.29 cm.
Question 11. Find the mean deviation about median for the following data :
| Marks | 0-10 | 10-20 | 20-30 | 30-40 | 40-50 | 50-60 |
| Number of Girls | 6 | 8 | 14 | 16 | 4 | 2 |
Answer:
The given data is a grouped frequency distribution of marks obtained by girls:
| Marks (Class Interval) | Number of Girls ($f_i$) |
| 0-10 | 6 |
| 10-20 | 8 |
| 20-30 | 14 |
| 30-40 | 16 |
| 40-50 | 4 |
| 50-60 | 2 |
To find the mean deviation about the median, we first need to calculate the median (M).
We calculate the cumulative frequencies (c.f.) and the total frequency $N = \sum\limits f_i$.
| Class Interval | Frequency ($f_i$) | Cumulative Frequency (c.f.) |
| 0-10 | 6 | 6 |
| 10-20 | 8 | $6 + 8 = 14$ |
| 20-30 | 14 | $14 + 14 = 28$ |
| 30-40 | 16 | $28 + 16 = 44$ |
| 40-50 | 4 | $44 + 4 = 48$ |
| 50-60 | 2 | $48 + 2 = 50$ |
| Total | $N = \sum\limits f_i = 50$ |
The total number of observations is $N = 50$. We need to find the class containing the $\left(\frac{N}{2}\right)^{\text{th}}$ observation.
$\frac{N}{2} = \frac{50}{2} = 25^{\text{th}}$ observation.
The cumulative frequency just greater than or equal to 25 is 28, which corresponds to the class interval 20-30.
So, the median class is 20-30.
For the median class (20-30):
Lower boundary (L) = 20
Frequency of the median class (f) = 14
Cumulative frequency of the class preceding the median class (c.f.) = 14 (c.f. of 10-20 class)
Class size (h) = $30 - 20 = 10$
The median (M) is calculated using the formula:
$M = L + \frac{\frac{N}{2} - c.f.}{f} \times h$
$M = 20 + \frac{25 - 14}{14} \times 10$
$M = 20 + \frac{11}{14} \times 10$
$M = 20 + \frac{110}{14} = 20 + \frac{55}{7}$
$M = \frac{20 \times 7 + 55}{7} = \frac{140 + 55}{7} = \frac{195}{7}$
The median of the data is $\frac{195}{7}$.
Next, we calculate the midpoints ($x_i$) of each class interval, the absolute deviation from the median $|x_i - \frac{195}{7}|$, and the product $f_i |x_i - \frac{195}{7}|$.
Note: $\frac{195}{7} \approx 27.857$
| Class Interval | $f_i$ | Midpoint ($x_i$) | $|x_i - \frac{195}{7}|$ | $f_i |x_i - \frac{195}{7}|$ |
| 0-10 | 6 | 5 | $|5 - \frac{195}{7}| = |\frac{35 - 195}{7}| = \frac{160}{7}$ | $6 \times \frac{160}{7} = \frac{960}{7}$ |
| 10-20 | 8 | 15 | $|15 - \frac{195}{7}| = |\frac{105 - 195}{7}| = \frac{90}{7}$ | $8 \times \frac{90}{7} = \frac{720}{7}$ |
| 20-30 | 14 | 25 | $|25 - \frac{195}{7}| = |\frac{175 - 195}{7}| = \frac{20}{7}$ | $14 \times \frac{20}{7} = 2 \times 20 = 40$ |
| 30-40 | 16 | 35 | $|35 - \frac{195}{7}| = |\frac{245 - 195}{7}| = \frac{50}{7}$ | $16 \times \frac{50}{7} = \frac{800}{7}$ |
| 40-50 | 4 | 45 | $|45 - \frac{195}{7}| = |\frac{315 - 195}{7}| = \frac{120}{7}$ | $4 \times \frac{120}{7} = \frac{480}{7}$ |
| 50-60 | 2 | 55 | $|55 - \frac{195}{7}| = |\frac{385 - 195}{7}| = \frac{190}{7}$ | $2 \times \frac{190}{7} = \frac{380}{7}$ |
| Total | $\sum\limits f_i = 50$ | $\sum\limits f_i |x_i - \frac{195}{7}| = \frac{960}{7} + \frac{720}{7} + 40 + \frac{800}{7} + \frac{480}{7} + \frac{380}{7}$ |
Sum of $f_i |x_i - \frac{195}{7}| = \frac{960 + 720 + 800 + 480 + 380}{7} + 40$
$= \frac{3340}{7} + 40 = \frac{3340 + 40 \times 7}{7} = \frac{3340 + 280}{7} = \frac{3620}{7}$
Finally, we calculate the mean deviation about the median (MD(M)).
MD(M) = $\frac{\sum\limits f_i |x_i - M|}{\sum\limits f_i}$
MD(M) = $\frac{\frac{3620}{7}}{50} = \frac{3620}{7 \times 50} = \frac{362}{7 \times 5} = \frac{362}{35}$
MD(M) $\approx 10.34$
The mean deviation about the median for the given data is $\frac{362}{35}$ or approximately 10.34.
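The median formula $M = L + \frac{\frac{N}{2} - c.f.}{f} \times h$ and the deviation sum can be sketched in Python with exact fractions. The function name `grouped_median` is hypothetical, and the sketch assumes the classes are continuous and listed in increasing order.

```python
from fractions import Fraction

# Class limits and frequencies from Question 11.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [6, 8, 14, 16, 4, 2]

def grouped_median(classes, freqs):
    """Median from M = L + ((N/2 - c.f.) / f) * h, as an exact fraction."""
    half = Fraction(sum(freqs), 2)
    cf_before = 0                      # c.f. of the classes before this one
    for (lo, hi), f in zip(classes, freqs):
        if cf_before + f >= half:      # first class whose c.f. reaches N/2
            return lo + (half - cf_before) / f * (hi - lo)
        cf_before += f
    raise ValueError("empty distribution")

median = grouped_median(classes, freqs)
mids = [Fraction(lo + hi, 2) for lo, hi in classes]
md = sum(f * abs(x - median) for f, x in zip(freqs, mids)) / sum(freqs)
print(median, md, round(float(md), 2))   # 195/7 362/35 10.34
```

Working in fractions avoids the rounding that would creep in if $\frac{195}{7}$ were replaced by 27.857 partway through.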
Question 12. Calculate the mean deviation about median age for the age distribution of 100 persons given below:
| Age (in years) | 16-20 | 21-25 | 26-30 | 31-35 | 36-40 | 41-45 | 46-50 | 51-55 |
| Number | 5 | 6 | 12 | 14 | 26 | 12 | 16 | 9 |
[Hint: Convert the given data into continuous frequency distribution by subtracting 0.5 from the lower limit and adding 0.5 to the upper limit of each class interval]
Answer:
To calculate the mean deviation about the median, we first need to find the median of the given data. The class intervals are in the inclusive form, so we must first convert them into a continuous (exclusive) form as suggested in the hint.
Step 1: Preparing the Frequency Distribution Table and Finding the Median Class
We convert the given class intervals to continuous intervals by subtracting 0.5 from the lower limit and adding 0.5 to the upper limit of each class. We then prepare a table with the cumulative frequencies (c.f.).
| Age (Continuous) | Number of Persons ($f_i$) | Cumulative Frequency (c.f.) |
| 15.5 - 20.5 | 5 | 5 |
| 20.5 - 25.5 | 6 | 11 |
| 25.5 - 30.5 | 12 | 23 |
| 30.5 - 35.5 | 14 | 37 |
| 35.5 - 40.5 | 26 | 63 |
| 40.5 - 45.5 | 12 | 75 |
| 45.5 - 50.5 | 16 | 91 |
| 50.5 - 55.5 | 9 | 100 |
| Total | $N = 100$ |
Here, the total number of observations is $N = 100$.
Now, we find the value of $\frac{N}{2} = \frac{100}{2} = 50$.
From the cumulative frequency column, we see that the cumulative frequency just greater than 50 is 63, which corresponds to the class interval 35.5 - 40.5. Therefore, this is the median class.
Step 2: Calculating the Median
The formula for the median (M) of a continuous frequency distribution is:
$M = l + \frac{\frac{N}{2} - C}{f} \times h$
Where:
- $l$ = lower limit of the median class = 35.5
- $N$ = total frequency = 100
- $C$ = cumulative frequency of the class preceding the median class = 37
- $f$ = frequency of the median class = 26
- $h$ = class size = $40.5 - 35.5 = 5$
Substituting these values into the formula:
$M = 35.5 + \frac{50 - 37}{26} \times 5$
$M = 35.5 + \frac{13}{26} \times 5$
$M = 35.5 + 0.5 \times 5$
$M = 35.5 + 2.5 = 38$
So, the median age is 38 years.
Step 3: Calculating the Mean Deviation about the Median
The formula for the mean deviation about the median (M.D.(M)) is:
$M.D.(M) = \frac{1}{N} \sum\limits_{i=1}^{n} f_i |x_i - M|$
We now create a table to calculate the required values.
| Age Class | Mid-point ($x_i$) | Frequency ($f_i$) | $|x_i - M| = |x_i - 38|$ | $f_i |x_i - M|$ |
| 15.5 - 20.5 | 18 | 5 | $|18 - 38| = 20$ | $5 \times 20 = 100$ |
| 20.5 - 25.5 | 23 | 6 | $|23 - 38| = 15$ | $6 \times 15 = 90$ |
| 25.5 - 30.5 | 28 | 12 | $|28 - 38| = 10$ | $12 \times 10 = 120$ |
| 30.5 - 35.5 | 33 | 14 | $|33 - 38| = 5$ | $14 \times 5 = 70$ |
| 35.5 - 40.5 | 38 | 26 | $|38 - 38| = 0$ | $26 \times 0 = 0$ |
| 40.5 - 45.5 | 43 | 12 | $|43 - 38| = 5$ | $12 \times 5 = 60$ |
| 45.5 - 50.5 | 48 | 16 | $|48 - 38| = 10$ | $16 \times 10 = 160$ |
| 50.5 - 55.5 | 53 | 9 | $|53 - 38| = 15$ | $9 \times 15 = 135$ |
| Total | $N = 100$ | $\sum\limits f_i |x_i - M| = 735$ |
From the table, we have $\sum\limits f_i |x_i - M| = 735$.
Now, we substitute the values into the formula for mean deviation:
$M.D.(M) = \frac{1}{100} \times 735$
$M.D.(M) = 7.35$
Hence, the mean deviation about the median age for the given distribution is 7.35 years.
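The conversion from inclusive to continuous classes described in the hint can be illustrated with a short Python sketch. The variable names are illustrative, and the half-gap of 0.5 assumes consecutive classes are separated by exactly 1, as they are in this question.

```python
from fractions import Fraction

# Inclusive class limits and frequencies from Question 12.
inclusive = [(16, 20), (21, 25), (26, 30), (31, 35),
             (36, 40), (41, 45), (46, 50), (51, 55)]
freqs = [5, 6, 12, 14, 26, 12, 16, 9]

half_gap = Fraction(1, 2)  # half the gap of 1 between consecutive classes
continuous = [(lo - half_gap, hi + half_gap) for lo, hi in inclusive]

n = sum(freqs)
half = Fraction(n, 2)
cf = 0                                  # cumulative frequency before the class
for (lo, hi), f in zip(continuous, freqs):
    if cf + f >= half:                  # median class found
        median = lo + (half - cf) / f * (hi - lo)
        break
    cf += f

mids = [(lo + hi) / 2 for lo, hi in continuous]
md = sum(f * abs(x - median) for f, x in zip(freqs, mids)) / n
print(median, float(md))   # 38 7.35
```

The midpoints are unchanged by the conversion (18, 23, ..., 53); only the class boundaries used in the median formula shift.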
Examples 8 to 12 (Before Exercise 15.2)
Example 8: Find the variance of the following data:
| 6 | 8 | 10 | 12 | 14 | 16 | 18 | 20 | 22 | 24 |
Answer:
The given data is: 6, 8, 10, 12, 14, 16, 18, 20, 22, 24.
The number of observations is $n = 10$.
First, we find the mean of the data.
Mean ($\overline{x}$) = $\frac{\text{Sum of observations}}{\text{Number of observations}} = \frac{\sum\limits x_i}{n}$
Sum of observations ($\sum\limits x_i$) = $6 + 8 + 10 + 12 + 14 + 16 + 18 + 20 + 22 + 24 = 150$
$\overline{x} = \frac{150}{10} = 15$
The mean of the data is 15.
Next, we calculate the deviations from the mean ($x_i - \overline{x}$) and the squared deviations ($(x_i - \overline{x})^2$).
| $x_i$ | $x_i - \overline{x} = x_i - 15$ | $(x_i - \overline{x})^2$ |
| 6 | $6 - 15 = -9$ | $(-9)^2 = 81$ |
| 8 | $8 - 15 = -7$ | $(-7)^2 = 49$ |
| 10 | $10 - 15 = -5$ | $(-5)^2 = 25$ |
| 12 | $12 - 15 = -3$ | $(-3)^2 = 9$ |
| 14 | $14 - 15 = -1$ | $(-1)^2 = 1$ |
| 16 | $16 - 15 = 1$ | $1^2 = 1$ |
| 18 | $18 - 15 = 3$ | $3^2 = 9$ |
| 20 | $20 - 15 = 5$ | $5^2 = 25$ |
| 22 | $22 - 15 = 7$ | $7^2 = 49$ |
| 24 | $24 - 15 = 9$ | $9^2 = 81$ |
| Total | $\sum\limits (x_i - \overline{x}) = 0$ | $\sum\limits (x_i - \overline{x})^2 = 330$ |
The variance ($\sigma^2$) for ungrouped data is given by the formula:
$\sigma^2 = \frac{\sum\limits_{i=1}^{n} (x_i - \overline{x})^2}{n}$
$\sigma^2 = \frac{330}{10}$
$\sigma^2 = 33$
The variance of the given data is 33.
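The same calculation can be sketched in Python, first from the definition and then with the standard library's `statistics.pvariance`, which computes the population variance (dividing by $n$, as in the formula above) rather than the sample variance (dividing by $n - 1$).

```python
from statistics import pvariance

data = [6, 8, 10, 12, 14, 16, 18, 20, 22, 24]

# Variance straight from the definition: mean of squared deviations.
mean = sum(data) / len(data)
variance = sum((x - mean) ** 2 for x in data) / len(data)
print(mean, variance)   # 15.0 33.0

# Library cross-check using the population variance.
print(pvariance(data) == variance)   # True
```

Calling `statistics.variance` instead would divide by $n - 1$ and give a different number, so the choice of function matters when checking textbook answers.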
Example 9: Find the variance and standard deviation for the following data:
| $x_i$ | 4 | 8 | 11 | 17 | 20 | 24 | 32 |
| $f_i$ | 3 | 5 | 9 | 5 | 4 | 3 | 1 |
Answer:
The given data is a discrete frequency distribution:
| $x_i$ | $f_i$ |
| 4 | 3 |
| 8 | 5 |
| 11 | 9 |
| 17 | 5 |
| 20 | 4 |
| 24 | 3 |
| 32 | 1 |
To find the variance and standard deviation, we first need to calculate the mean ($\overline{x}$).
We calculate $f_i x_i$ for each value and the total frequency $\sum\limits f_i$ and the sum $\sum\limits f_i x_i$.
| $x_i$ | $f_i$ | $f_i x_i$ |
| 4 | 3 | $3 \times 4 = 12$ |
| 8 | 5 | $5 \times 8 = 40$ |
| 11 | 9 | $9 \times 11 = 99$ |
| 17 | 5 | $5 \times 17 = 85$ |
| 20 | 4 | $4 \times 20 = 80$ |
| 24 | 3 | $3 \times 24 = 72$ |
| 32 | 1 | $1 \times 32 = 32$ |
| $\sum\limits f_i = 30$ | $\sum\limits f_i x_i = 420$ |
The mean ($\overline{x}$) is calculated as:
$\overline{x} = \frac{\sum\limits f_i x_i}{\sum\limits f_i}$
$\overline{x} = \frac{420}{30} = 14$
The mean of the data is 14.
Next, we calculate the deviations from the mean ($x_i - \overline{x}$), the squared deviations ($(x_i - \overline{x})^2$), and the product $f_i (x_i - \overline{x})^2$.
| $x_i$ | $f_i$ | $x_i - 14$ | $(x_i - 14)^2$ | $f_i (x_i - 14)^2$ |
| 4 | 3 | $4 - 14 = -10$ | $(-10)^2 = 100$ | $3 \times 100 = 300$ |
| 8 | 5 | $8 - 14 = -6$ | $(-6)^2 = 36$ | $5 \times 36 = 180$ |
| 11 | 9 | $11 - 14 = -3$ | $(-3)^2 = 9$ | $9 \times 9 = 81$ |
| 17 | 5 | $17 - 14 = 3$ | $3^2 = 9$ | $5 \times 9 = 45$ |
| 20 | 4 | $20 - 14 = 6$ | $6^2 = 36$ | $4 \times 36 = 144$ |
| 24 | 3 | $24 - 14 = 10$ | $10^2 = 100$ | $3 \times 100 = 300$ |
| 32 | 1 | $32 - 14 = 18$ | $18^2 = 324$ | $1 \times 324 = 324$ |
| $\sum\limits f_i = 30$ | $\sum\limits f_i (x_i - 14)^2 = 1374$ |
The variance ($\sigma^2$) is given by the formula:
$\sigma^2 = \frac{\sum\limits f_i (x_i - \overline{x})^2}{\sum\limits f_i}$
$\sigma^2 = \frac{1374}{30} = \frac{137.4}{3} = 45.8$
The variance of the data is 45.8.
The standard deviation ($\sigma$) is the square root of the variance.
$\sigma = \sqrt{\sigma^2} = \sqrt{45.8}$
Calculating the square root:
$\sqrt{45.8} \approx 6.76757$
The standard deviation is approximately 6.77.
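For a discrete frequency distribution, each squared deviation is weighted by its frequency. A minimal Python sketch of that idea follows; the helper name `weighted_variance` is an assumption for illustration.

```python
import math

# Values and frequencies from Example 9.
values = [4, 8, 11, 17, 20, 24, 32]
freqs = [3, 5, 9, 5, 4, 3, 1]

def weighted_variance(values, freqs):
    """Return (mean, variance) of a discrete frequency distribution."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, values)) / n
    var = sum(f * (x - mean) ** 2 for f, x in zip(freqs, values)) / n
    return mean, var

mean, var = weighted_variance(values, freqs)
sd = math.sqrt(var)
print(mean, var, round(sd, 2))   # 14.0 45.8 6.77
```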
Example 10: Calculate the mean, variance and standard deviation for the following distribution :
| Class | 30-40 | 40-50 | 50-60 | 60-70 | 70-80 | 80-90 | 90-100 |
| Frequency | 3 | 7 | 12 | 15 | 8 | 3 | 2 |
Answer:
The given data is a grouped frequency distribution:
| Class Interval | Frequency ($f_i$) |
| 30-40 | 3 |
| 40-50 | 7 |
| 50-60 | 12 |
| 60-70 | 15 |
| 70-80 | 8 |
| 80-90 | 3 |
| 90-100 | 2 |
First, we calculate the mean ($\overline{x}$).
We find the midpoints ($x_i$) of each class interval and the product $f_i x_i$. We also find the total frequency $N = \sum\limits f_i$ and the sum $\sum\limits f_i x_i$.
| Class Interval | Frequency ($f_i$) | Midpoint ($x_i$) | $f_i x_i$ |
| 30-40 | 3 | $\frac{30+40}{2} = 35$ | $3 \times 35 = 105$ |
| 40-50 | 7 | $\frac{40+50}{2} = 45$ | $7 \times 45 = 315$ |
| 50-60 | 12 | $\frac{50+60}{2} = 55$ | $12 \times 55 = 660$ |
| 60-70 | 15 | $\frac{60+70}{2} = 65$ | $15 \times 65 = 975$ |
| 70-80 | 8 | $\frac{70+80}{2} = 75$ | $8 \times 75 = 600$ |
| 80-90 | 3 | $\frac{80+90}{2} = 85$ | $3 \times 85 = 255$ |
| 90-100 | 2 | $\frac{90+100}{2} = 95$ | $2 \times 95 = 190$ |
| Total | $N = \sum\limits f_i = 50$ | $\sum\limits f_i x_i = 3100$ |
The mean ($\overline{x}$) is given by the formula:
$\overline{x} = \frac{\sum\limits f_i x_i}{N}$
$\overline{x} = \frac{3100}{50} = 62$
The mean of the distribution is 62.
Next, we calculate the variance ($\sigma^2$).
We calculate the deviations from the mean $(x_i - \overline{x})$, the squared deviations $(x_i - \overline{x})^2$, and the product $f_i (x_i - \overline{x})^2$.
| Class Interval | $x_i$ | $f_i$ | $x_i - \overline{x} = x_i - 62$ | $(x_i - \overline{x})^2$ | $f_i (x_i - \overline{x})^2$ |
| 30-40 | 35 | 3 | $35 - 62 = -27$ | $(-27)^2 = 729$ | $3 \times 729 = 2187$ |
| 40-50 | 45 | 7 | $45 - 62 = -17$ | $(-17)^2 = 289$ | $7 \times 289 = 2023$ |
| 50-60 | 55 | 12 | $55 - 62 = -7$ | $(-7)^2 = 49$ | $12 \times 49 = 588$ |
| 60-70 | 65 | 15 | $65 - 62 = 3$ | $3^2 = 9$ | $15 \times 9 = 135$ |
| 70-80 | 75 | 8 | $75 - 62 = 13$ | $13^2 = 169$ | $8 \times 169 = 1352$ |
| 80-90 | 85 | 3 | $85 - 62 = 23$ | $23^2 = 529$ | $3 \times 529 = 1587$ |
| 90-100 | 95 | 2 | $95 - 62 = 33$ | $33^2 = 1089$ | $2 \times 1089 = 2178$ |
| Total | $N = 50$ | $\sum\limits f_i (x_i - \overline{x}) = 0$ | $\sum\limits f_i (x_i - \overline{x})^2 = 10050$ |
The variance ($\sigma^2$) is given by the formula:
$\sigma^2 = \frac{\sum\limits f_i (x_i - \overline{x})^2}{N}$
$\sigma^2 = \frac{10050}{50}$
$\sigma^2 = 201$
The variance of the distribution is 201.
Finally, we calculate the standard deviation ($\sigma$), which is the square root of the variance.
$\sigma = \sqrt{\sigma^2} = \sqrt{201}$
Using a calculator, $\sqrt{201} \approx 14.177$
The standard deviation is approximately 14.18.
Example 11: Find the standard deviation for the following data :
| $x_i$ | 3 | 8 | 13 | 18 | 23 |
| $f_i$ | 7 | 10 | 15 | 10 | 6 |
Answer:
The given data is a discrete frequency distribution:
| $x_i$ | $f_i$ |
| 3 | 7 |
| 8 | 10 |
| 13 | 15 |
| 18 | 10 |
| 23 | 6 |
First, we find the mean ($\overline{x}$).
We calculate $f_i x_i$ for each value and the total frequency $\sum\limits f_i$ and the sum $\sum\limits f_i x_i$.
| $x_i$ | $f_i$ | $f_i x_i$ |
| 3 | 7 | $7 \times 3 = 21$ |
| 8 | 10 | $10 \times 8 = 80$ |
| 13 | 15 | $15 \times 13 = 195$ |
| 18 | 10 | $10 \times 18 = 180$ |
| 23 | 6 | $6 \times 23 = 138$ |
| $\sum\limits f_i = 48$ | $\sum\limits f_i x_i = 614$ |
The mean ($\overline{x}$) is calculated as:
$\overline{x} = \frac{\sum\limits f_i x_i}{\sum\limits f_i} = \frac{614}{48} = \frac{307}{24}$
The mean of the data is $\frac{307}{24} \approx 12.79$.
Next, we calculate the variance ($\sigma^2$) and standard deviation ($\sigma$).
We can use the formula $\sigma^2 = \frac{1}{N} \sum\limits f_i x_i^2 - (\overline{x})^2$. For this, we need $x_i^2$ and $f_i x_i^2$.
| $x_i$ | $f_i$ | $x_i^2$ | $f_i x_i^2$ |
| 3 | 7 | $3^2 = 9$ | $7 \times 9 = 63$ |
| 8 | 10 | $8^2 = 64$ | $10 \times 64 = 640$ |
| 13 | 15 | $13^2 = 169$ | $15 \times 169 = 2535$ |
| 18 | 10 | $18^2 = 324$ | $10 \times 324 = 3240$ |
| 23 | 6 | $23^2 = 529$ | $6 \times 529 = 3174$ |
| $N = \sum\limits f_i = 48$ | $\sum\limits f_i x_i^2 = 9652$ |
The variance ($\sigma^2$) is:
$\sigma^2 = \frac{\sum\limits f_i x_i^2}{N} - (\overline{x})^2$
$\sigma^2 = \frac{9652}{48} - \left(\frac{614}{48}\right)^2$
$\sigma^2 = \frac{2413}{12} - \left(\frac{307}{24}\right)^2$
$\sigma^2 = \frac{2413}{12} - \frac{94249}{576}$
To combine the fractions, we find a common denominator, which is 576 ($12 \times 48 = 576$).
$\sigma^2 = \frac{2413 \times 48}{12 \times 48} - \frac{94249}{576}$
$\sigma^2 = \frac{115824}{576} - \frac{94249}{576}$
$\sigma^2 = \frac{115824 - 94249}{576} = \frac{21575}{576}$
The variance is $\frac{21575}{576}$.
The standard deviation ($\sigma$) is the square root of the variance.
$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{21575}{576}} = \frac{\sqrt{21575}}{\sqrt{576}} = \frac{\sqrt{21575}}{24}$
Calculating the square root of 21575:
$\sqrt{21575} \approx 146.8843$
$\sigma \approx \frac{146.8843}{24} \approx 6.1202$
The standard deviation is approximately 6.12.
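The shortcut formula $\sigma^2 = \frac{1}{N} \sum\limits f_i x_i^2 - (\overline{x})^2$ lends itself to an exact check with fractions. The sketch below is one possible way to do this in Python; the variable names are illustrative.

```python
from fractions import Fraction
import math

# Values and frequencies from Example 11.
values = [3, 8, 13, 18, 23]
freqs = [7, 10, 15, 10, 6]

n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, values))        # sum of f_i * x_i
sum_fx2 = sum(f * x * x for f, x in zip(freqs, values))   # sum of f_i * x_i^2

mean = Fraction(sum_fx, n)
variance = Fraction(sum_fx2, n) - mean ** 2
sd = math.sqrt(variance)
print(mean, variance, round(sd, 2))   # 307/24 21575/576 6.12
```

Because the mean here is not a whole number, this form avoids squaring awkward decimal deviations one row at a time.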
Example 12: Calculate mean, variance and standard deviation for the following distribution.
| Classes | 30-40 | 40-50 | 50-60 | 60-70 | 70-80 | 80-90 | 90-100 |
| Frequency | 3 | 7 | 12 | 15 | 8 | 3 | 2 |
Answer:
To calculate the mean, variance, and standard deviation for the given grouped data, we will use the step-deviation method, which simplifies the calculations.
Step 1: Construct the Calculation Table
We create a table to organize the data and intermediate calculations. Let's choose an assumed mean (A) from the mid-points. A good choice is the mid-point of the class with the highest frequency. Here, the class 60-70 has the highest frequency (15), so we'll set the assumed mean $A = 65$. The class size ($h$) is 10.
| Classes | Frequency ($f_i$) | Mid-point ($x_i$) | $y_i = \frac{x_i - A}{h} = \frac{x_i - 65}{10}$ | $f_i y_i$ | $y_i^2$ | $f_i y_i^2$ |
| 30-40 | 3 | 35 | -3 | -9 | 9 | 27 |
| 40-50 | 7 | 45 | -2 | -14 | 4 | 28 |
| 50-60 | 12 | 55 | -1 | -12 | 1 | 12 |
| 60-70 | 15 | 65 | 0 | 0 | 0 | 0 |
| 70-80 | 8 | 75 | 1 | 8 | 1 | 8 |
| 80-90 | 3 | 85 | 2 | 6 | 4 | 12 |
| 90-100 | 2 | 95 | 3 | 6 | 9 | 18 |
| Total | $N = \sum\limits f_i = 50$ | $\sum\limits f_i y_i = -15$ | $\sum\limits f_i y_i^2 = 105$ |
From the table, we have:
$N = 50$, $\sum\limits f_i y_i = -15$, $\sum\limits f_i y_i^2 = 105$, $A = 65$, $h = 10$.
Step 2: Calculate the Mean ($\bar{x}$)
The formula for the mean using the step-deviation method is:
$\bar{x} = A + \left( \frac{\sum\limits f_i y_i}{N} \right) \times h$
Substituting the values from our table:
$\bar{x} = 65 + \left( \frac{-15}{50} \right) \times 10$
$\bar{x} = 65 - \frac{150}{50}$
$\bar{x} = 65 - 3 = 62$
The mean of the distribution is 62.
Step 3: Calculate the Variance ($\sigma^2$)
The formula for the variance using the step-deviation method is:
$\sigma^2 = h^2 \left[ \frac{\sum\limits f_i y_i^2}{N} - \left( \frac{\sum\limits f_i y_i}{N} \right)^2 \right]$
Substituting the values from our table:
$\sigma^2 = 10^2 \left[ \frac{105}{50} - \left( \frac{-15}{50} \right)^2 \right]$
$\sigma^2 = 100 \left[ 2.1 - \left( -0.3 \right)^2 \right]$
$\sigma^2 = 100 [2.1 - 0.09]$
$\sigma^2 = 100 [2.01] = 201$
The variance of the distribution is 201.
Step 4: Calculate the Standard Deviation ($\sigma$)
The standard deviation is the square root of the variance.
$\sigma = \sqrt{\sigma^2} = \sqrt{201}$
$\sigma \approx 14.177$
The standard deviation of the distribution is approximately 14.18.
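The step-deviation method can be sketched in Python as follows. The function name `step_deviation` and its parameter names are assumptions for illustration; the formulas are the ones used in Steps 2 and 3 above.

```python
from fractions import Fraction
import math

# Midpoints and frequencies from Example 12.
mids = [35, 45, 55, 65, 75, 85, 95]
freqs = [3, 7, 12, 15, 8, 3, 2]

def step_deviation(mids, freqs, assumed_mean, class_size):
    """Mean and variance via y_i = (x_i - A) / h, kept as exact fractions."""
    n = sum(freqs)
    ys = [Fraction(x - assumed_mean, class_size) for x in mids]
    mean_y = sum(f * y for f, y in zip(freqs, ys)) / n
    mean_y2 = sum(f * y * y for f, y in zip(freqs, ys)) / n
    mean = assumed_mean + class_size * mean_y
    variance = class_size ** 2 * (mean_y2 - mean_y ** 2)
    return mean, variance

mean, variance = step_deviation(mids, freqs, assumed_mean=65, class_size=10)
print(mean, variance, round(math.sqrt(variance), 2))   # 62 201 14.18

# A different assumed mean leads to the same mean and variance.
print(step_deviation(mids, freqs, assumed_mean=45, class_size=10) == (mean, variance))  # True
```

The second call shows that the assumed mean is only a computational convenience: any choice of $A$ gives the same final values.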
Exercise 15.2
Find the mean and variance for each of the data in Exercises 1 to 5.
Question 1.
| 6 | 7 | 10 | 12 | 13 | 4 | 8 | 12 |
Answer:
The given data is: 6, 7, 10, 12, 13, 4, 8, 12.
The number of observations is $n = 8$.
First, we find the mean of the data.
Mean ($\overline{x}$) = $\frac{\text{Sum of observations}}{\text{Number of observations}}$
Sum of observations = $6 + 7 + 10 + 12 + 13 + 4 + 8 + 12 = 72$
$\overline{x} = \frac{72}{8} = 9$
The mean of the data is 9.
Next, we calculate the deviations from the mean ($x_i - \overline{x}$) and the squared deviations ($(x_i - \overline{x})^2$).
| $x_i$ | $x_i - \overline{x} = x_i - 9$ | $(x_i - \overline{x})^2$ |
| 6 | $6 - 9 = -3$ | $(-3)^2 = 9$ |
| 7 | $7 - 9 = -2$ | $(-2)^2 = 4$ |
| 10 | $10 - 9 = 1$ | $1^2 = 1$ |
| 12 | $12 - 9 = 3$ | $3^2 = 9$ |
| 13 | $13 - 9 = 4$ | $4^2 = 16$ |
| 4 | $4 - 9 = -5$ | $(-5)^2 = 25$ |
| 8 | $8 - 9 = -1$ | $(-1)^2 = 1$ |
| 12 | $12 - 9 = 3$ | $3^2 = 9$ |
| Total | $\sum\limits (x_i - \overline{x}) = 0$ | $\sum\limits (x_i - \overline{x})^2 = 74$ |
The variance ($\sigma^2$) is given by the formula:
$\sigma^2 = \frac{\sum\limits_{i=1}^{n} (x_i - \overline{x})^2}{n}$
$\sigma^2 = \frac{74}{8} = \frac{37}{4} = 9.25$
The mean of the data is 9 and the variance is 9.25.
Question 2. First n natural numbers
Answer:
The data consists of the first $n$ natural numbers: $1, 2, 3, \dots, n$.
The number of observations is $n$.
First, we find the mean of the data.
The sum of the first $n$ natural numbers is given by the formula $\sum\limits_{i=1}^{n} i = \frac{n(n+1)}{2}$.
Mean ($\overline{x}$) = $\frac{\text{Sum of observations}}{\text{Number of observations}} = \frac{\sum\limits_{i=1}^{n} i}{n}$
$\overline{x} = \frac{\frac{n(n+1)}{2}}{n} = \frac{n(n+1)}{2n}$
$\overline{x} = \frac{n+1}{2}$
The mean of the first $n$ natural numbers is $\frac{n+1}{2}$.
Next, we find the variance ($\sigma^2$).
The variance can be calculated using the formula $\sigma^2 = \frac{\sum\limits_{i=1}^{n} x_i^2}{n} - (\overline{x})^2$.
The sum of the squares of the first $n$ natural numbers is given by the formula $\sum\limits_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}$.
Substituting the values for $\sum\limits x_i^2$ and $\overline{x}$ into the variance formula:
$\sigma^2 = \frac{\frac{n(n+1)(2n+1)}{6}}{n} - \left(\frac{n+1}{2}\right)^2$
$\sigma^2 = \frac{n(n+1)(2n+1)}{6n} - \frac{(n+1)^2}{4}$
$\sigma^2 = \frac{(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4}$
To subtract the fractions, we find a common denominator, which is 12.
$\sigma^2 = \frac{2(n+1)(2n+1)}{12} - \frac{3(n+1)^2}{12}$
$\sigma^2 = \frac{(n+1)[2(2n+1) - 3(n+1)]}{12}$
$\sigma^2 = \frac{(n+1)[4n + 2 - 3n - 3]}{12}$
$\sigma^2 = \frac{(n+1)(n - 1)}{12}$
$\sigma^2 = \frac{n^2 - 1}{12}$
The variance of the first $n$ natural numbers is $\frac{n^2 - 1}{12}$.
The mean is $\frac{n+1}{2}$ and the variance is $\frac{n^2 - 1}{12}$.
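The closed forms derived above can be spot-checked numerically. The sketch below compares them with a direct computation for $n = 1$ to $50$; the function name `mean_and_variance_first_n` is illustrative, and a numerical check of this kind supports the algebra but does not replace it.

```python
from fractions import Fraction

def mean_and_variance_first_n(n):
    """Exact mean and variance of 1, 2, ..., n computed directly."""
    data = range(1, n + 1)
    mean = Fraction(sum(data), n)
    var = Fraction(sum(x * x for x in data), n) - mean ** 2
    return mean, var

# Compare the direct computation with (n + 1)/2 and (n^2 - 1)/12.
for n in range(1, 51):
    mean, var = mean_and_variance_first_n(n)
    assert mean == Fraction(n + 1, 2)
    assert var == Fraction(n * n - 1, 12)

print(mean_and_variance_first_n(10))   # (Fraction(11, 2), Fraction(33, 4))
```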
Question 3. First 10 multiples of 3
Answer:
The data consists of the first 10 multiples of 3: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30.
The number of observations is $n = 10$.
First, we find the mean of the data.
Sum of observations ($\sum\limits x_i$) = $3 + 6 + 9 + 12 + 15 + 18 + 21 + 24 + 27 + 30$
$\sum\limits x_i = 165$
Mean ($\overline{x}$) = $\frac{\sum\limits x_i}{n}$
$\overline{x} = \frac{165}{10} = 16.5$
The mean of the data is 16.5.
Next, we calculate the variance ($\sigma^2$). We will use the formula $\sigma^2 = \frac{\sum\limits x_i^2}{n} - (\overline{x})^2$.
We calculate the square of each observation ($x_i^2$) and their sum ($\sum\limits x_i^2$).
| $x_i$ | $x_i^2$ |
| 3 | $3^2 = 9$ |
| 6 | $6^2 = 36$ |
| 9 | $9^2 = 81$ |
| 12 | $12^2 = 144$ |
| 15 | $15^2 = 225$ |
| 18 | $18^2 = 324$ |
| 21 | $21^2 = 441$ |
| 24 | $24^2 = 576$ |
| 27 | $27^2 = 729$ |
| 30 | $30^2 = 900$ |
| $\sum\limits x_i^2 = 3465$ |
The mean squared is $(\overline{x})^2 = (16.5)^2 = 272.25$.
The variance ($\sigma^2$) is given by the formula:
$\sigma^2 = \frac{\sum\limits x_i^2}{n} - (\overline{x})^2$
$\sigma^2 = \frac{3465}{10} - 272.25$
$\sigma^2 = 346.5 - 272.25$
$\sigma^2 = 74.25$
The mean of the data is 16.5 and the variance is 74.25.
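This result can also be cross-checked against the formula $\frac{n^2 - 1}{12}$ for the first $n$ natural numbers: multiplying every observation by 3 multiplies the variance by $3^2 = 9$. The Python sketch below, with illustrative variable names, computes the variance directly and through this scaling argument.

```python
from fractions import Fraction

multiples = [3 * k for k in range(1, 11)]   # 3, 6, ..., 30
n = len(multiples)

# Direct computation with the formula used above.
mean = Fraction(sum(multiples), n)
variance = Fraction(sum(x * x for x in multiples), n) - mean ** 2

# Scaling check: Var(3k) = 3^2 * Var(k), with Var(k) = (n^2 - 1)/12.
scaled = 9 * Fraction(n * n - 1, 12)
print(mean, variance, scaled)   # 33/2 297/4 297/4
```

Both routes give $\frac{297}{4} = 74.25$.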
Question 4.
| $x_i$ | 6 | 10 | 14 | 18 | 24 | 28 | 30 |
| $f_i$ | 2 | 4 | 7 | 12 | 8 | 4 | 3 |
Answer:
The given data is a discrete frequency distribution:
| $x_i$ | $f_i$ |
| 6 | 2 |
| 10 | 4 |
| 14 | 7 |
| 18 | 12 |
| 24 | 8 |
| 28 | 4 |
| 30 | 3 |
First, we find the mean ($\overline{x}$).
We calculate $f_i x_i$ for each value and the total frequency $N = \sum\limits f_i$ and the sum $\sum\limits f_i x_i$.
| $x_i$ | $f_i$ | $f_i x_i$ |
| 6 | 2 | $2 \times 6 = 12$ |
| 10 | 4 | $4 \times 10 = 40$ |
| 14 | 7 | $7 \times 14 = 98$ |
| 18 | 12 | $12 \times 18 = 216$ |
| 24 | 8 | $8 \times 24 = 192$ |
| 28 | 4 | $4 \times 28 = 112$ |
| 30 | 3 | $3 \times 30 = 90$ |
| $N = \sum\limits f_i = 40$ | $\sum\limits f_i x_i = 760$ |
The mean ($\overline{x}$) is calculated as:
$\overline{x} = \frac{\sum\limits f_i x_i}{N}$
$\overline{x} = \frac{760}{40} = 19$
The mean of the data is 19.
Next, we calculate the variance ($\sigma^2$).
We calculate the deviations from the mean $(x_i - \overline{x})$, the squared deviations $(x_i - \overline{x})^2$, and the product $f_i (x_i - \overline{x})^2$.
| $x_i$ | $f_i$ | $x_i - \overline{x} = x_i - 19$ | $(x_i - \overline{x})^2$ | $f_i (x_i - \overline{x})^2$ |
| 6 | 2 | $6 - 19 = -13$ | $(-13)^2 = 169$ | $2 \times 169 = 338$ |
| 10 | 4 | $10 - 19 = -9$ | $(-9)^2 = 81$ | $4 \times 81 = 324$ |
| 14 | 7 | $14 - 19 = -5$ | $(-5)^2 = 25$ | $7 \times 25 = 175$ |
| 18 | 12 | $18 - 19 = -1$ | $(-1)^2 = 1$ | $12 \times 1 = 12$ |
| 24 | 8 | $24 - 19 = 5$ | $5^2 = 25$ | $8 \times 25 = 200$ |
| 28 | 4 | $28 - 19 = 9$ | $9^2 = 81$ | $4 \times 81 = 324$ |
| 30 | 3 | $30 - 19 = 11$ | $11^2 = 121$ | $3 \times 121 = 363$ |
| Total | $N = 40$ | $\sum\limits f_i (x_i - \overline{x}) = 0$ | $\sum\limits f_i (x_i - \overline{x})^2 = 1736$ |
The variance ($\sigma^2$) is given by the formula:
$\sigma^2 = \frac{\sum\limits f_i (x_i - \overline{x})^2}{N}$
$\sigma^2 = \frac{1736}{40}$
$\sigma^2 = \frac{173.6}{4} = 43.4$
The mean of the data is 19 and the variance is 43.4.
Question 5.
| $x_i$ | 92 | 93 | 97 | 98 | 102 | 104 | 109 |
| $f_i$ | 3 | 2 | 3 | 2 | 6 | 3 | 3 |
Answer:
The given data is a discrete frequency distribution:
| $x_i$ | $f_i$ |
| 92 | 3 |
| 93 | 2 |
| 97 | 3 |
| 98 | 2 |
| 102 | 6 |
| 104 | 3 |
| 109 | 3 |
First, we find the mean ($\overline{x}$).
We calculate $f_i x_i$ for each value and the total frequency $N = \sum\limits f_i$ and the sum $\sum\limits f_i x_i$.
| $x_i$ | $f_i$ | $f_i x_i$ |
| 92 | 3 | $3 \times 92 = 276$ |
| 93 | 2 | $2 \times 93 = 186$ |
| 97 | 3 | $3 \times 97 = 291$ |
| 98 | 2 | $2 \times 98 = 196$ |
| 102 | 6 | $6 \times 102 = 612$ |
| 104 | 3 | $3 \times 104 = 312$ |
| 109 | 3 | $3 \times 109 = 327$ |
| $N = \sum\limits f_i = 22$ | $\sum\limits f_i x_i = 2200$ |
The mean ($\overline{x}$) is calculated as:
$\overline{x} = \frac{\sum\limits f_i x_i}{N}$
$\overline{x} = \frac{2200}{22} = 100$
The mean of the data is 100.
Next, we calculate the variance ($\sigma^2$).
We calculate the deviations from the mean $(x_i - \overline{x})$, the squared deviations $(x_i - \overline{x})^2$, and the product $f_i (x_i - \overline{x})^2$.
| $x_i$ | $f_i$ | $x_i - \overline{x} = x_i - 100$ | $(x_i - \overline{x})^2$ | $f_i (x_i - \overline{x})^2$ |
| 92 | 3 | $92 - 100 = -8$ | $(-8)^2 = 64$ | $3 \times 64 = 192$ |
| 93 | 2 | $93 - 100 = -7$ | $(-7)^2 = 49$ | $2 \times 49 = 98$ |
| 97 | 3 | $97 - 100 = -3$ | $(-3)^2 = 9$ | $3 \times 9 = 27$ |
| 98 | 2 | $98 - 100 = -2$ | $(-2)^2 = 4$ | $2 \times 4 = 8$ |
| 102 | 6 | $102 - 100 = 2$ | $2^2 = 4$ | $6 \times 4 = 24$ |
| 104 | 3 | $104 - 100 = 4$ | $4^2 = 16$ | $3 \times 16 = 48$ |
| 109 | 3 | $109 - 100 = 9$ | $9^2 = 81$ | $3 \times 81 = 243$ |
| Total | $N = 22$ | $\sum\limits f_i (x_i - \overline{x}) = 0$ | $\sum\limits f_i (x_i - \overline{x})^2 = 640$ |
The variance ($\sigma^2$) is given by the formula:
$\sigma^2 = \frac{\sum\limits f_i (x_i - \overline{x})^2}{N}$
$\sigma^2 = \frac{640}{22} = \frac{320}{11}$
$\sigma^2 \approx 29.09$
The mean of the data is 100 and the variance is $\frac{320}{11}$ or approximately 29.09.
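The two tables above can be reproduced with a short script. This is a minimal sketch, not part of the textbook solution: the function name `mean_and_variance` is illustrative, and `Fraction` is used only so that the variance comes out as the exact value $\frac{320}{11}$ rather than a rounded decimal.

```python
from fractions import Fraction

def mean_and_variance(values, freqs):
    """Population mean and variance of a discrete frequency distribution."""
    n = sum(freqs)                                    # N = sum of f_i
    mean = Fraction(sum(f * x for x, f in zip(values, freqs)), n)
    # sum of f_i * (x_i - mean)^2, divided by N
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
    return mean, variance

x = [92, 93, 97, 98, 102, 104, 109]
f = [3, 2, 3, 2, 6, 3, 3]
mean, variance = mean_and_variance(x, f)
print(mean, variance, float(variance))
```

The divisor is $N$ (not $N - 1$), matching the population variance formula used throughout these solutions.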
Question 6. Find the mean and standard deviation using short-cut method.
| $x_i$ | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 |
| $f_i$ | 2 | 1 | 12 | 29 | 25 | 12 | 10 | 4 | 5 |
Answer:
The given data is a discrete frequency distribution:
| $x_i$ | $f_i$ |
| 60 | 2 |
| 61 | 1 |
| 62 | 12 |
| 63 | 29 |
| 64 | 25 |
| 65 | 12 |
| 66 | 10 |
| 67 | 4 |
| 68 | 5 |
We will use the short-cut method to find the mean and standard deviation.
Let the assumed mean be $A = 64$. We calculate the deviations $d_i = x_i - A = x_i - 64$, and the products $f_i d_i$ and $f_i d_i^2$. We also find the total frequency $N = \sum\limits f_i$.
| $x_i$ | $f_i$ | $d_i = x_i - 64$ | $f_i d_i$ | $d_i^2$ | $f_i d_i^2$ |
| 60 | 2 | -4 | $2 \times (-4) = -8$ | 16 | $2 \times 16 = 32$ |
| 61 | 1 | -3 | $1 \times (-3) = -3$ | 9 | $1 \times 9 = 9$ |
| 62 | 12 | -2 | $12 \times (-2) = -24$ | 4 | $12 \times 4 = 48$ |
| 63 | 29 | -1 | $29 \times (-1) = -29$ | 1 | $29 \times 1 = 29$ |
| 64 | 25 | 0 | $25 \times 0 = 0$ | 0 | $25 \times 0 = 0$ |
| 65 | 12 | 1 | $12 \times 1 = 12$ | 1 | $12 \times 1 = 12$ |
| 66 | 10 | 2 | $10 \times 2 = 20$ | 4 | $10 \times 4 = 40$ |
| 67 | 4 | 3 | $4 \times 3 = 12$ | 9 | $4 \times 9 = 36$ |
| 68 | 5 | 4 | $5 \times 4 = 20$ | 16 | $5 \times 16 = 80$ |
| Total | $N = \sum\limits f_i = 100$ | | $\sum\limits f_i d_i = 0$ | | $\sum\limits f_i d_i^2 = 286$ |
Mean ($\overline{x}$)
The mean is given by the formula: $\overline{x} = A + \frac{\sum\limits f_i d_i}{N}$
$\overline{x} = 64 + \frac{0}{100}$
$\overline{x} = 64 + 0 = 64$
The mean of the data is 64.
Variance ($\sigma^2$)
The variance is given by the formula: $\sigma^2 = \frac{\sum\limits f_i d_i^2}{N} - \left(\frac{\sum\limits f_i d_i}{N}\right)^2$
$\sigma^2 = \frac{286}{100} - \left(\frac{0}{100}\right)^2$
$\sigma^2 = 2.86 - 0^2 = 2.86$
The variance of the data is 2.86.
Standard Deviation ($\sigma$)
The standard deviation is the square root of the variance.
$\sigma = \sqrt{\sigma^2} = \sqrt{2.86}$
Using a calculator, $\sqrt{2.86} \approx 1.691$
The standard deviation is approximately 1.691.
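As a quick check of the short-cut method, the sketch below applies the same two formulas with the assumed mean $A = 64$ and again with a different, arbitrarily chosen $A = 63$. The function name `shortcut_mean_sd` is an assumption made for illustration. Both choices should give the same mean and variance, since the assumed mean only shifts the deviations.

```python
import math

def shortcut_mean_sd(values, freqs, assumed_mean):
    """Mean, variance and SD via deviations d_i = x_i - A."""
    n = sum(freqs)
    d = [x - assumed_mean for x in values]
    sum_fd = sum(f * di for f, di in zip(freqs, d))        # sum of f_i d_i
    sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))  # sum of f_i d_i^2
    mean = assumed_mean + sum_fd / n
    variance = sum_fd2 / n - (sum_fd / n) ** 2
    return mean, variance, math.sqrt(variance)

x = list(range(60, 69))                  # 60, 61, ..., 68
f = [2, 1, 12, 29, 25, 12, 10, 4, 5]

mean, variance, sd = shortcut_mean_sd(x, f, 64)   # A = 64, as in the solution
alt_mean, alt_variance, _ = shortcut_mean_sd(x, f, 63)  # a different A
print(mean, variance, round(sd, 3))
```

Choosing $A = 64$ is simply convenient here because it makes $\sum f_i d_i = 0$, so the correction term vanishes.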
Find the mean and variance for the following frequency distributions in Exercises 7 and 8.
Question 7.
| Classes | 0-30 | 30-60 | 60-90 | 90-120 | 120-150 | 150-180 | 180-210 |
| Frequencies | 2 | 3 | 5 | 10 | 3 | 5 | 2 |
Answer:
The given data is a grouped frequency distribution:
| Class Interval | Frequency ($f_i$) |
| 0-30 | 2 |
| 30-60 | 3 |
| 60-90 | 5 |
| 90-120 | 10 |
| 120-150 | 3 |
| 150-180 | 5 |
| 180-210 | 2 |
First, we calculate the mean ($\overline{x}$).
We find the midpoints ($x_i$) of each class interval and the product $f_i x_i$. We also find the total frequency $N = \sum\limits f_i$ and the sum $\sum\limits f_i x_i$.
| Class Interval | Frequency ($f_i$) | Midpoint ($x_i$) | $f_i x_i$ |
| 0-30 | 2 | $\frac{0+30}{2} = 15$ | $2 \times 15 = 30$ |
| 30-60 | 3 | $\frac{30+60}{2} = 45$ | $3 \times 45 = 135$ |
| 60-90 | 5 | $\frac{60+90}{2} = 75$ | $5 \times 75 = 375$ |
| 90-120 | 10 | $\frac{90+120}{2} = 105$ | $10 \times 105 = 1050$ |
| 120-150 | 3 | $\frac{120+150}{2} = 135$ | $3 \times 135 = 405$ |
| 150-180 | 5 | $\frac{150+180}{2} = 165$ | $5 \times 165 = 825$ |
| 180-210 | 2 | $\frac{180+210}{2} = 195$ | $2 \times 195 = 390$ |
| Total | $N = \sum\limits f_i = 30$ | | $\sum\limits f_i x_i = 3210$ |
The mean ($\overline{x}$) is given by the formula:
$\overline{x} = \frac{\sum\limits f_i x_i}{N}$
$\overline{x} = \frac{3210}{30} = \frac{321}{3} = 107$
The mean of the distribution is 107.
Next, we calculate the variance ($\sigma^2$).
We can use the formula $\sigma^2 = \frac{\sum\limits f_i (x_i - \overline{x})^2}{N}$ or $\sigma^2 = \frac{\sum\limits f_i x_i^2}{N} - (\overline{x})^2$. Let's use the first formula by calculating the deviations from the mean $(x_i - \overline{x})$ and their squares.
| Class Interval | $x_i$ | $f_i$ | $x_i - \overline{x} = x_i - 107$ | $(x_i - \overline{x})^2$ | $f_i (x_i - \overline{x})^2$ |
| 0-30 | 15 | 2 | $15 - 107 = -92$ | $(-92)^2 = 8464$ | $2 \times 8464 = 16928$ |
| 30-60 | 45 | 3 | $45 - 107 = -62$ | $(-62)^2 = 3844$ | $3 \times 3844 = 11532$ |
| 60-90 | 75 | 5 | $75 - 107 = -32$ | $(-32)^2 = 1024$ | $5 \times 1024 = 5120$ |
| 90-120 | 105 | 10 | $105 - 107 = -2$ | $(-2)^2 = 4$ | $10 \times 4 = 40$ |
| 120-150 | 135 | 3 | $135 - 107 = 28$ | $28^2 = 784$ | $3 \times 784 = 2352$ |
| 150-180 | 165 | 5 | $165 - 107 = 58$ | $58^2 = 3364$ | $5 \times 3364 = 16820$ |
| 180-210 | 195 | 2 | $195 - 107 = 88$ | $88^2 = 7744$ | $2 \times 7744 = 15488$ |
| Total | | $N = 30$ | | | $\sum\limits f_i (x_i - \overline{x})^2 = 68280$ |
The variance ($\sigma^2$) is given by the formula:
$\sigma^2 = \frac{\sum\limits f_i (x_i - \overline{x})^2}{N}$
$\sigma^2 = \frac{68280}{30}$
$\sigma^2 = \frac{6828}{3} = 2276$
The mean of the distribution is 107 and the variance is 2276.
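The solution mentions two equivalent variance formulas but works through only the first. The sketch below, with the illustrative helper name `grouped_stats`, computes the variance both ways from the class midpoints so the two results can be compared.

```python
def grouped_stats(class_limits, freqs):
    """Mean and variance (two equivalent formulas) for grouped data."""
    mids = [(lo + hi) / 2 for lo, hi in class_limits]      # class midpoints x_i
    n = sum(freqs)
    mean = sum(f * m for f, m in zip(freqs, mids)) / n
    # Formula 1: sum of f_i (x_i - mean)^2 / N
    var_dev = sum(f * (m - mean) ** 2 for f, m in zip(freqs, mids)) / n
    # Formula 2: sum of f_i x_i^2 / N - mean^2
    var_sq = sum(f * m * m for f, m in zip(freqs, mids)) / n - mean ** 2
    return mean, var_dev, var_sq

classes = [(lo, lo + 30) for lo in range(0, 210, 30)]      # 0-30, ..., 180-210
freqs = [2, 3, 5, 10, 3, 5, 2]
mean, var_dev, var_sq = grouped_stats(classes, freqs)
print(mean, var_dev, var_sq)
```

The second formula avoids the deviation columns entirely, which is often less work by hand when the mean is not a whole number.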
Question 8.
| Classes | 0-10 | 10-20 | 20-30 | 30-40 | 40-50 |
| Frequencies | 5 | 8 | 15 | 16 | 6 |
Answer:
The given data is a grouped frequency distribution:
| Class Interval | Frequency ($f_i$) |
| 0-10 | 5 |
| 10-20 | 8 |
| 20-30 | 15 |
| 30-40 | 16 |
| 40-50 | 6 |
First, we calculate the mean ($\overline{x}$).
We find the midpoints ($x_i$) of each class interval and the product $f_i x_i$. We also find the total frequency $N = \sum\limits f_i$ and the sum $\sum\limits f_i x_i$.
| Class Interval | Frequency ($f_i$) | Midpoint ($x_i$) | $f_i x_i$ |
| 0-10 | 5 | $\frac{0+10}{2} = 5$ | $5 \times 5 = 25$ |
| 10-20 | 8 | $\frac{10+20}{2} = 15$ | $8 \times 15 = 120$ |
| 20-30 | 15 | $\frac{20+30}{2} = 25$ | $15 \times 25 = 375$ |
| 30-40 | 16 | $\frac{30+40}{2} = 35$ | $16 \times 35 = 560$ |
| 40-50 | 6 | $\frac{40+50}{2} = 45$ | $6 \times 45 = 270$ |
| Total | $N = \sum\limits f_i = 50$ | | $\sum\limits f_i x_i = 1350$ |
The mean ($\overline{x}$) is given by the formula:
$\overline{x} = \frac{\sum\limits f_i x_i}{N}$
$\overline{x} = \frac{1350}{50} = \frac{135}{5} = 27$
The mean of the distribution is 27.
Next, we calculate the variance ($\sigma^2$).
We calculate the deviations from the mean $(x_i - \overline{x})$, the squared deviations $(x_i - \overline{x})^2$, and the product $f_i (x_i - \overline{x})^2$.
| Class Interval | $x_i$ | $f_i$ | $x_i - \overline{x} = x_i - 27$ | $(x_i - \overline{x})^2$ | $f_i (x_i - \overline{x})^2$ |
| 0-10 | 5 | 5 | $5 - 27 = -22$ | $(-22)^2 = 484$ | $5 \times 484 = 2420$ |
| 10-20 | 15 | 8 | $15 - 27 = -12$ | $(-12)^2 = 144$ | $8 \times 144 = 1152$ |
| 20-30 | 25 | 15 | $25 - 27 = -2$ | $(-2)^2 = 4$ | $15 \times 4 = 60$ |
| 30-40 | 35 | 16 | $35 - 27 = 8$ | $8^2 = 64$ | $16 \times 64 = 1024$ |
| 40-50 | 45 | 6 | $45 - 27 = 18$ | $18^2 = 324$ | $6 \times 324 = 1944$ |
| Total | | $N = 50$ | | | $\sum\limits f_i (x_i - \overline{x})^2 = 6600$ |
The variance ($\sigma^2$) is given by the formula:
$\sigma^2 = \frac{\sum\limits f_i (x_i - \overline{x})^2}{N}$
$\sigma^2 = \frac{6600}{50}$
$\sigma^2 = \frac{660}{5} = 132$
The mean of the distribution is 27 and the variance is 132.
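Another way to cross-check a grouped result is to expand the table into individual values and hand them to Python's standard `statistics` module. This is only a sketch under the usual grouped-data assumption that every observation in a class sits at the class midpoint. `statistics.pvariance` is the population variance (divisor $N$), which is the definition used in this chapter.

```python
import statistics

classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [5, 8, 15, 16, 6]

# Repeat each class midpoint as many times as its frequency.
expanded = []
for (lo, hi), f in zip(classes, freqs):
    expanded.extend([(lo + hi) / 2] * f)

mean = statistics.fmean(expanded)
variance = statistics.pvariance(expanded)   # population variance, divisor N
print(len(expanded), mean, variance)
```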
Question 9. Find the mean, variance and standard deviation using the short-cut method.
| Height in cms | 70-75 | 75-80 | 80-85 | 85-90 | 90-95 | 95-100 | 100-105 | 105-110 | 110-115 |
| No. of children | 3 | 4 | 7 | 7 | 15 | 9 | 6 | 6 | 3 |
Answer:
To find the mean, variance, and standard deviation using the short-cut method (also known as the step-deviation method), we will first construct a calculation table. The step-deviation method is ideal here as the class sizes are uniform.
Step 1: Construct the Calculation Table
We choose an assumed mean (A) to simplify calculations. A good choice is the mid-point of the class with the highest frequency. The class 90-95 has the highest frequency (15), so we'll set the assumed mean $A = 92.5$. The class size ($h$) is $75 - 70 = 5$.
| Height in cms | No. of children ($f_i$) | Mid-point ($x_i$) | $y_i = \frac{x_i - A}{h} = \frac{x_i - 92.5}{5}$ | $f_i y_i$ | $y_i^2$ | $f_i y_i^2$ |
| 70-75 | 3 | 72.5 | -4 | -12 | 16 | 48 |
| 75-80 | 4 | 77.5 | -3 | -12 | 9 | 36 |
| 80-85 | 7 | 82.5 | -2 | -14 | 4 | 28 |
| 85-90 | 7 | 87.5 | -1 | -7 | 1 | 7 |
| 90-95 | 15 | 92.5 | 0 | 0 | 0 | 0 |
| 95-100 | 9 | 97.5 | 1 | 9 | 1 | 9 |
| 100-105 | 6 | 102.5 | 2 | 12 | 4 | 24 |
| 105-110 | 6 | 107.5 | 3 | 18 | 9 | 54 |
| 110-115 | 3 | 112.5 | 4 | 12 | 16 | 48 |
| Total | $N = \sum\limits f_i = 60$ | | | $\sum\limits f_i y_i = 6$ | | $\sum\limits f_i y_i^2 = 254$ |
From the table, we have:
$N = 60$, $\sum\limits f_i y_i = 6$, $\sum\limits f_i y_i^2 = 254$, $A = 92.5$, $h = 5$.
Step 2: Calculate the Mean ($\bar{x}$)
The formula for the mean using the short-cut method is:
$\bar{x} = A + \left( \frac{\sum\limits f_i y_i}{N} \right) \times h$
Substituting the values from our table:
$\bar{x} = 92.5 + \left( \frac{6}{60} \right) \times 5$
$\bar{x} = 92.5 + (0.1) \times 5$
$\bar{x} = 92.5 + 0.5 = 93$
The mean height is 93 cm.
Step 3: Calculate the Variance ($\sigma^2$)
The formula for the variance using the short-cut method is:
$\sigma^2 = h^2 \left[ \frac{\sum\limits f_i y_i^2}{N} - \left( \frac{\sum\limits f_i y_i}{N} \right)^2 \right]$
Substituting the values from our table:
$\sigma^2 = 5^2 \left[ \frac{254}{60} - \left( \frac{6}{60} \right)^2 \right]$
$\sigma^2 = 25 \left[ \frac{254}{60} - (0.1)^2 \right]$
$\sigma^2 = 25 \left[ 4.2333... - 0.01 \right]$
$\sigma^2 = 25 [4.2233...]$
$\sigma^2 \approx 105.58$
The variance is approximately 105.58 cm².
Step 4: Calculate the Standard Deviation ($\sigma$)
The standard deviation is the square root of the variance.
$\sigma = \sqrt{\sigma^2} \approx \sqrt{105.58}$
$\sigma \approx 10.275$
The standard deviation is approximately 10.28 cm.
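Steps 1 to 4 can be sketched in code as follows. The function name `step_deviation` and its parameter names are illustrative, not from the textbook. It scales the deviations by the class size $h$ and then undoes the scaling in the final formulas.

```python
import math

def step_deviation(midpoints, freqs, assumed_mean, h):
    """Mean, variance and SD using y_i = (x_i - A) / h."""
    n = sum(freqs)
    y = [(x - assumed_mean) / h for x in midpoints]
    sum_fy = sum(f * yi for f, yi in zip(freqs, y))         # sum of f_i y_i
    sum_fy2 = sum(f * yi ** 2 for f, yi in zip(freqs, y))   # sum of f_i y_i^2
    mean = assumed_mean + (sum_fy / n) * h
    variance = h ** 2 * (sum_fy2 / n - (sum_fy / n) ** 2)
    return mean, variance, math.sqrt(variance)

midpoints = [72.5 + 5 * k for k in range(9)]    # 72.5, 77.5, ..., 112.5
freqs = [3, 4, 7, 7, 15, 9, 6, 6, 3]
mean, variance, sd = step_deviation(midpoints, freqs, 92.5, 5)
print(mean, round(variance, 2), round(sd, 2))
```

The factor $h^2$ in the variance appears because dividing every deviation by $h$ divides the variance by $h^2$.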
Question 10. The diameters of circles (in mm) drawn in a design are given below:
| Diameters | 33-36 | 37-40 | 41-44 | 45-48 | 49-52 |
| No. of circles | 15 | 17 | 21 | 22 | 25 |
Calculate the standard deviation and mean diameter of the circles.
[Hint: First make the data continuous by making the classes as 32.5-36.5, 36.5-40.5, 40.5-44.5, 44.5 - 48.5, 48.5 - 52.5 and then proceed.]
Answer:
To calculate the mean and standard deviation, we will use the short-cut (step-deviation) method. First, we need to convert the discontinuous class intervals into continuous ones, as suggested in the hint.
Step 1: Preparing the Data and Calculation Table
The class intervals are made continuous by subtracting 0.5 from the lower limit and adding 0.5 to the upper limit of each class. We then set up a table for calculations. Let's choose an assumed mean (A) from the mid-points. The class 41-44 (or 40.5-44.5) is in the middle of the distribution, so we will set the assumed mean $A = 42.5$. The class size ($h$) is $36.5 - 32.5 = 4$.
| Diameters (mm) | No. of circles ($f_i$) | Mid-point ($x_i$) | $y_i = \frac{x_i - A}{h} = \frac{x_i - 42.5}{4}$ | $f_i y_i$ | $y_i^2$ | $f_i y_i^2$ |
| 32.5-36.5 | 15 | 34.5 | -2 | -30 | 4 | 60 |
| 36.5-40.5 | 17 | 38.5 | -1 | -17 | 1 | 17 |
| 40.5-44.5 | 21 | 42.5 | 0 | 0 | 0 | 0 |
| 44.5-48.5 | 22 | 46.5 | 1 | 22 | 1 | 22 |
| 48.5-52.5 | 25 | 50.5 | 2 | 50 | 4 | 100 |
| Total | $N = \sum\limits f_i = 100$ | | | $\sum\limits f_i y_i = 25$ | | $\sum\limits f_i y_i^2 = 199$ |
From the table, we have:
$N = 100$, $\sum\limits f_i y_i = 25$, $\sum\limits f_i y_i^2 = 199$, $A = 42.5$, $h = 4$.
Step 2: Calculate the Mean Diameter ($\bar{x}$)
The formula for the mean using the step-deviation method is:
$\bar{x} = A + \left( \frac{\sum\limits f_i y_i}{N} \right) \times h$
Substituting the values from our table:
$\bar{x} = 42.5 + \left( \frac{25}{100} \right) \times 4$
$\bar{x} = 42.5 + (0.25) \times 4$
$\bar{x} = 42.5 + 1 = 43.5$
The mean diameter of the circles is 43.5 mm.
Step 3: Calculate the Variance ($\sigma^2$) and Standard Deviation ($\sigma$)
The formula for the variance using the step-deviation method is:
$\sigma^2 = h^2 \left[ \frac{\sum\limits f_i y_i^2}{N} - \left( \frac{\sum\limits f_i y_i}{N} \right)^2 \right]$
Substituting the values from our table:
$\sigma^2 = 4^2 \left[ \frac{199}{100} - \left( \frac{25}{100} \right)^2 \right]$
$\sigma^2 = 16 \left[ 1.99 - (0.25)^2 \right]$
$\sigma^2 = 16 [1.99 - 0.0625]$
$\sigma^2 = 16 [1.9275] = 30.84$
The standard deviation is the square root of the variance.
$\sigma = \sqrt{30.84}$
$\sigma \approx 5.55$
The standard deviation of the diameters is approximately 5.55 mm.
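The continuity correction from the hint can also be written as a small helper. In this sketch, `make_continuous` is a hypothetical name, and the half-gap of 0.5 is derived from the data (the gap between 36 and 37) rather than hard-coded. The mean and standard deviation are then computed directly from the midpoints as a check on the step-deviation result.

```python
import math

def make_continuous(classes):
    """Close the gap between inclusive classes such as 33-36, 37-40."""
    gap = classes[1][0] - classes[0][1]        # 37 - 36 = 1
    half = gap / 2                             # 0.5
    return [(lo - half, hi + half) for lo, hi in classes]

classes = [(33, 36), (37, 40), (41, 44), (45, 48), (49, 52)]
freqs = [15, 17, 21, 22, 25]

continuous = make_continuous(classes)          # [(32.5, 36.5), ...]
midpoints = [(lo + hi) / 2 for lo, hi in continuous]
n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / n
sd = math.sqrt(variance)
print(continuous[0], mean, round(sd, 2))
```

Note that the correction leaves the midpoints unchanged (for example, $\frac{33+36}{2} = \frac{32.5+36.5}{2} = 34.5$). What it changes is the class size, which becomes $h = 4$ instead of 3, and that is the value the step-deviation formulas need.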
Example 13 to 15 (Before Exercise 15.3)
Example 13: Two plants A and B of a factory show the following results about the number of workers and the wages paid to them:
| | A | B |
|---|---|---|
| No. of workers | 5000 | 6000 |
| Average monthly wages | ₹ 2500 | ₹ 2500 |
| Variance of distribution of wages | 81 | 100 |
In which plant, A or B is there greater variability in individual wages?
Answer:
Given information for Plant A and Plant B:
| | Plant A | Plant B |
| Number of workers ($N$) | 5000 | 6000 |
| Average monthly wages ($\overline{x}$) | $\textsf{₹}$ 2500 | $\textsf{₹}$ 2500 |
| Variance of distribution of wages ($\sigma^2$) | 81 | 100 |
To compare the variability in individual wages, we need to calculate the Coefficient of Variation (C.V.) for each plant.
The formula for Coefficient of Variation is $C.V. = \frac{\sigma}{\overline{x}} \times 100$, where $\sigma$ is the standard deviation and $\overline{x}$ is the mean (average wages).
First, calculate the standard deviation ($\sigma$) from the variance ($\sigma^2 = \text{Variance}$).
For Plant A:
Standard Deviation ($\sigma_A$) = $\sqrt{\text{Variance}_A} = \sqrt{81} = 9$
Coefficient of Variation ($C.V._A$) = $\frac{\sigma_A}{\overline{x}_A} \times 100$
$C.V._A = \frac{9}{2500} \times 100 = \frac{900}{2500} = \frac{9}{25} = 0.36$
Coefficient of Variation for Plant A is 0.36%.
For Plant B:
Standard Deviation ($\sigma_B$) = $\sqrt{\text{Variance}_B} = \sqrt{100} = 10$
Coefficient of Variation ($C.V._B$) = $\frac{\sigma_B}{\overline{x}_B} \times 100$
$C.V._B = \frac{10}{2500} \times 100 = \frac{1000}{2500} = \frac{10}{25} = 0.4$
Coefficient of Variation for Plant B is 0.4%.
Comparing the Coefficients of Variation:
$C.V._A = 0.36$
$C.V._B = 0.4$
Since $C.V._B > C.V._A$ ($0.4 > 0.36$), there is greater variability in the wages in Plant B compared to Plant A.
Conclusion: There is greater variability in individual wages in Plant B.
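The comparison can be sketched in a few lines. The helper name `coefficient_of_variation` is an assumption for illustration. It takes the variance and the mean, as given in the question, and returns the C.V. as a percentage.

```python
import math

def coefficient_of_variation(variance, mean):
    """C.V. in percent: (standard deviation / mean) * 100."""
    return math.sqrt(variance) / mean * 100

cv_a = coefficient_of_variation(81, 2500)    # Plant A
cv_b = coefficient_of_variation(100, 2500)   # Plant B
print(round(cv_a, 2), round(cv_b, 2))
```

Because both plants have the same mean wage, comparing the standard deviations (9 and 10) alone would lead to the same ranking. The C.V. becomes essential when the means differ.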
Example 14: The coefficients of variation of two distributions are 60 and 70, and their standard deviations are 21 and 16, respectively. What are their arithmetic means?
Answer:
The formula for the Coefficient of Variation (C.V.) is:
$C.V. = \frac{\sigma}{\overline{x}} \times 100$
where $\sigma$ is the standard deviation and $\overline{x}$ is the arithmetic mean.
We can rearrange this formula to find the arithmetic mean:
$\overline{x} = \frac{\sigma}{C.V.} \times 100$
For the first distribution:
Given: $C.V._1 = 60$, $\sigma_1 = 21$
Arithmetic Mean ($\overline{x}_1$) = $\frac{\sigma_1}{C.V._1} \times 100$
$\overline{x}_1 = \frac{21}{60} \times 100$
$\overline{x}_1 = \frac{\cancel{21}^{7}}{\cancel{60}_{20}} \times \cancel{100}^{5}$
$\overline{x}_1 = 7 \times 5 = 35$
The arithmetic mean of the first distribution is 35.
For the second distribution:
Given: $C.V._2 = 70$, $\sigma_2 = 16$
Arithmetic Mean ($\overline{x}_2$) = $\frac{\sigma_2}{C.V._2} \times 100$
$\overline{x}_2 = \frac{16}{70} \times 100$
$\overline{x}_2 = \frac{160}{7}$
$\overline{x}_2 \approx 22.857$
The arithmetic mean of the second distribution is $\frac{160}{7}$ or approximately 22.86.
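The rearranged formula can be checked with exact arithmetic. In this sketch, `mean_from_cv` is an illustrative name, and `Fraction` is used so that the second mean is returned as exactly $\frac{160}{7}$.

```python
from fractions import Fraction

def mean_from_cv(sd, cv_percent):
    """Invert C.V. = (sd / mean) * 100 to recover the mean."""
    return Fraction(sd) * 100 / cv_percent

mean_1 = mean_from_cv(21, 60)
mean_2 = mean_from_cv(16, 70)
print(mean_1, mean_2, round(float(mean_2), 2))
```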
Example 15: The following values are calculated in respect of heights and weights of the students of a section of Class XI :
| | Height | Weight |
|---|---|---|
| Mean | 162.6 cm | 52.36 kg |
| Variance | 127.69 cm$^2$ | 23.1361 kg$^2$ |
Can we say that the weights show greater variation than the heights?
Answer:
The given information about the heights and weights of the students is:
For Height:
Mean ($\overline{x}_{\text{Height}}$) = 162.6 cm
Variance ($\sigma^2_{\text{Height}}$) = 127.69 cm$^2$
For Weight:
Mean ($\overline{x}_{\text{Weight}}$) = 52.36 kg
Variance ($\sigma^2_{\text{Weight}}$) = 23.1361 kg$^2$
To compare the variability of two distributions when they are measured in different units (cm and kg) or have significantly different means, we use the Coefficient of Variation (C.V.). A higher Coefficient of Variation indicates greater relative variability.
The formula for the Coefficient of Variation is given by:
$C.V. = \frac{\sigma}{\overline{x}} \times 100$
where $\sigma$ is the standard deviation and $\overline{x}$ is the mean.
First, we need to calculate the standard deviation ($\sigma$) for both height and weight from their respective variances ($\sigma = \sqrt{\sigma^2}$).
For Height:
Standard Deviation ($\sigma_{\text{Height}}$) = $\sqrt{\text{Variance}_{\text{Height}}} = \sqrt{127.69}$
$\sigma_{\text{Height}} = 11.3$ cm
For Weight:
Standard Deviation ($\sigma_{\text{Weight}}$) = $\sqrt{\text{Variance}_{\text{Weight}}} = \sqrt{23.1361}$
$\sigma_{\text{Weight}} = 4.81$ kg
Now, we calculate the Coefficient of Variation for both height and weight.
For Height:
$C.V._{\text{Height}} = \frac{\sigma_{\text{Height}}}{\overline{x}_{\text{Height}}} \times 100$
$C.V._{\text{Height}} = \frac{11.3}{162.6} \times 100$
$C.V._{\text{Height}} \approx 0.069495 \times 100 \approx 6.95\%$
For Weight:
$C.V._{\text{Weight}} = \frac{\sigma_{\text{Weight}}}{\overline{x}_{\text{Weight}}} \times 100$
$C.V._{\text{Weight}} = \frac{4.81}{52.36} \times 100$
$C.V._{\text{Weight}} \approx 0.091864 \times 100 \approx 9.19\%$
Comparing the Coefficients of Variation:
$C.V._{\text{Height}} \approx 6.95\%$
$C.V._{\text{Weight}} \approx 9.19\%$
Since $C.V._{\text{Weight}} > C.V._{\text{Height}}$ ($9.19\% > 6.95\%$), the weights show greater relative variation than the heights.
Conclusion: Yes, the weights show greater variation than the heights because the Coefficient of Variation for weights is greater than that for heights.
Exercise 15.3
Question 1. From the data given below state which group is more variable, A or B?
| Marks | 10-20 | 20-30 | 30-40 | 40-50 | 50-60 | 60-70 | 70-80 |
| Group A | 9 | 17 | 32 | 33 | 40 | 10 | 9 |
| Group B | 10 | 20 | 30 | 25 | 43 | 15 | 7 |
Answer:
To compare the variability of the two groups, A and B, we will calculate the Coefficient of Variation (C.V.) for each group. The group with the higher Coefficient of Variation is considered more variable.
The formula for the Coefficient of Variation is $C.V. = \frac{\sigma}{\overline{x}} \times 100$, where $\sigma$ is the standard deviation and $\overline{x}$ is the mean.
First, we calculate the mean ($\overline{x}$) and standard deviation ($\sigma$) for each group using the grouped frequency distribution method. The classes are continuous. The class size is $h = 20 - 10 = 10$. We calculate the midpoints ($x_i$) for each class.
Midpoints ($x_i$): 15, 25, 35, 45, 55, 65, 75.
We will use the step-deviation method with assumed mean $A = 45$ and $h = 10$.
Let $u_i = \frac{x_i - A}{h} = \frac{x_i - 45}{10}$.
$u_i$ values: -3, -2, -1, 0, 1, 2, 3.
$u_i^2$ values: 9, 4, 1, 0, 1, 4, 9.
For Group A:
Frequencies ($f_{iA}$): 9, 17, 32, 33, 40, 10, 9
Total frequency $N_A = \sum\limits f_{iA} = 9 + 17 + 32 + 33 + 40 + 10 + 9 = 150$
| Class | $x_i$ | $f_{iA}$ | $u_i = \frac{x_i - 45}{10}$ | $f_{iA} u_i$ | $u_i^2$ | $f_{iA} u_i^2$ |
| 10-20 | 15 | 9 | -3 | -27 | 9 | 81 |
| 20-30 | 25 | 17 | -2 | -34 | 4 | 68 |
| 30-40 | 35 | 32 | -1 | -32 | 1 | 32 |
| 40-50 | 45 | 33 | 0 | 0 | 0 | 0 |
| 50-60 | 55 | 40 | 1 | 40 | 1 | 40 |
| 60-70 | 65 | 10 | 2 | 20 | 4 | 40 |
| 70-80 | 75 | 9 | 3 | 27 | 9 | 81 |
| Total | | $N_A = 150$ | | $\sum\limits f_{iA} u_i = -6$ | | $\sum\limits f_{iA} u_i^2 = 342$ |
Mean for Group A ($\overline{x}_A$) = $A + \frac{\sum\limits f_{iA} u_i}{N_A} \times h = 45 + \frac{-6}{150} \times 10 = 45 - \frac{60}{150} = 45 - 0.4 = 44.6$
Variance for Group A ($\sigma_A^2$) = $h^2 \left[ \frac{\sum\limits f_{iA} u_i^2}{N_A} - \left(\frac{\sum\limits f_{iA} u_i}{N_A}\right)^2 \right] = 10^2 \left[ \frac{342}{150} - \left(\frac{-6}{150}\right)^2 \right] = 100 \left[ 2.28 - \left(-\frac{1}{25}\right)^2 \right] = 100 [2.28 - 0.0016] = 100 [2.2784] = 227.84$
Standard Deviation for Group A ($\sigma_A$) = $\sqrt{227.84} \approx 15.09437$
Coefficient of Variation for Group A ($C.V._A$) = $\frac{\sigma_A}{\overline{x}_A} \times 100 = \frac{15.09437}{44.6} \times 100 \approx 33.84\%$
For Group B:
Frequencies ($f_{iB}$): 10, 20, 30, 25, 43, 15, 7
Total frequency $N_B = \sum\limits f_{iB} = 10 + 20 + 30 + 25 + 43 + 15 + 7 = 150$
| Class | $x_i$ | $f_{iB}$ | $u_i = \frac{x_i - 45}{10}$ | $f_{iB} u_i$ | $u_i^2$ | $f_{iB} u_i^2$ |
| 10-20 | 15 | 10 | -3 | -30 | 9 | 90 |
| 20-30 | 25 | 20 | -2 | -40 | 4 | 80 |
| 30-40 | 35 | 30 | -1 | -30 | 1 | 30 |
| 40-50 | 45 | 25 | 0 | 0 | 0 | 0 |
| 50-60 | 55 | 43 | 1 | 43 | 1 | 43 |
| 60-70 | 65 | 15 | 2 | 30 | 4 | 60 |
| 70-80 | 75 | 7 | 3 | 21 | 9 | 63 |
| Total | | $N_B = 150$ | | $\sum\limits f_{iB} u_i = -6$ | | $\sum\limits f_{iB} u_i^2 = 366$ |
Mean for Group B ($\overline{x}_B$) = $A + \frac{\sum\limits f_{iB} u_i}{N_B} \times h = 45 + \frac{-6}{150} \times 10 = 45 - 0.4 = 44.6$
Variance for Group B ($\sigma_B^2$) = $h^2 \left[ \frac{\sum\limits f_{iB} u_i^2}{N_B} - \left(\frac{\sum\limits f_{iB} u_i}{N_B}\right)^2 \right] = 10^2 \left[ \frac{366}{150} - \left(\frac{-6}{150}\right)^2 \right] = 100 \left[ 2.44 - \left(-\frac{1}{25}\right)^2 \right] = 100 [2.44 - 0.0016] = 100 [2.4384] = 243.84$
Standard Deviation for Group B ($\sigma_B$) = $\sqrt{243.84} \approx 15.61537$
Coefficient of Variation for Group B ($C.V._B$) = $\frac{\sigma_B}{\overline{x}_B} \times 100 = \frac{15.61537}{44.6} \times 100 \approx 35.01\%$
Comparing the Coefficients of Variation:
$C.V._A \approx 33.84\%$
$C.V._B \approx 35.01\%$
Since the Coefficient of Variation for Group B ($35.01\%$) is greater than the Coefficient of Variation for Group A ($33.84\%$), Group B is more variable.
Conclusion: Group B is more variable than Group A.
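The whole comparison can be reproduced with one helper applied to both groups. This is a sketch only. `grouped_cv` is an illustrative name, and it works directly from the midpoints instead of the step deviations $u_i$, which gives the same mean and standard deviation.

```python
import math

def grouped_cv(midpoints, freqs):
    """Return (mean, population SD, C.V. in percent) for grouped data."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
    variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / n
    sd = math.sqrt(variance)
    return mean, sd, 100 * sd / mean

midpoints = [15, 25, 35, 45, 55, 65, 75]
group_a = [9, 17, 32, 33, 40, 10, 9]
group_b = [10, 20, 30, 25, 43, 15, 7]

mean_a, sd_a, cv_a = grouped_cv(midpoints, group_a)
mean_b, sd_b, cv_b = grouped_cv(midpoints, group_b)
more_variable = "B" if cv_b > cv_a else "A"
print(round(cv_a, 2), round(cv_b, 2), more_variable)
```

Both groups turn out to have the same mean (44.6), so here the group with the larger standard deviation is necessarily the one with the larger C.V.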
Question 2. From the prices of shares X and Y below, find out which is more stable in value:
| X | 35 | 54 | 52 | 53 | 56 | 58 | 52 | 50 | 51 | 49 |
| Y | 108 | 107 | 105 | 105 | 106 | 107 | 104 | 103 | 104 | 101 |
Answer:
To determine which share is more stable in value, we need to compare their variability. Since the mean prices of the two shares are different, the best measure for comparison is the Coefficient of Variation (C.V.). The share with the lower C.V. is considered more stable.
The formula for the Coefficient of Variation is:
$C.V. = \frac{\sigma}{\bar{x}} \times 100$
Where $\sigma$ is the standard deviation and $\bar{x}$ is the mean.
Analysis for Share X
The prices for share X are: 35, 54, 52, 53, 56, 58, 52, 50, 51, 49.
Number of observations, $n = 10$.
1. Calculate the Mean ($\bar{x}_X$)
$\sum\limits x_i = 35+54+52+53+56+58+52+50+51+49 = 510$
$\bar{x}_X = \frac{\sum\limits x_i}{n} = \frac{510}{10} = 51$
2. Calculate the Standard Deviation ($\sigma_X$)
We'll first find the variance ($\sigma_X^2$). The formula is $\sigma_X^2 = \frac{\sum\limits (x_i - \bar{x}_X)^2}{n}$.
| $x_i$ | $x_i - \bar{x}_X = x_i - 51$ | $(x_i - \bar{x}_X)^2$ |
| 35 | -16 | 256 |
| 54 | 3 | 9 |
| 52 | 1 | 1 |
| 53 | 2 | 4 |
| 56 | 5 | 25 |
| 58 | 7 | 49 |
| 52 | 1 | 1 |
| 50 | -1 | 1 |
| 51 | 0 | 0 |
| 49 | -2 | 4 |
| Total | | $\sum\limits (x_i - \bar{x}_X)^2 = 350$ |
Variance, $\sigma_X^2 = \frac{350}{10} = 35$.
Standard Deviation, $\sigma_X = \sqrt{35} \approx 5.916$.
3. Calculate the Coefficient of Variation ($C.V._X$)
$C.V._X = \frac{5.916}{51} \times 100 \approx 11.6$
Analysis for Share Y
The prices for share Y are: 108, 107, 105, 105, 106, 107, 104, 103, 104, 101.
Number of observations, $n = 10$.
1. Calculate the Mean ($\bar{x}_Y$)
$\sum\limits y_i = 108+107+105+105+106+107+104+103+104+101 = 1050$
$\bar{x}_Y = \frac{\sum\limits y_i}{n} = \frac{1050}{10} = 105$
2. Calculate the Standard Deviation ($\sigma_Y$)
The formula is $\sigma_Y^2 = \frac{\sum\limits (y_i - \bar{x}_Y)^2}{n}$.
| $y_i$ | $y_i - \bar{x}_Y = y_i - 105$ | $(y_i - \bar{x}_Y)^2$ |
| 108 | 3 | 9 |
| 107 | 2 | 4 |
| 105 | 0 | 0 |
| 105 | 0 | 0 |
| 106 | 1 | 1 |
| 107 | 2 | 4 |
| 104 | -1 | 1 |
| 103 | -2 | 4 |
| 104 | -1 | 1 |
| 101 | -4 | 16 |
| Total | | $\sum\limits (y_i - \bar{x}_Y)^2 = 40$ |
Variance, $\sigma_Y^2 = \frac{40}{10} = 4$.
Standard Deviation, $\sigma_Y = \sqrt{4} = 2$.
3. Calculate the Coefficient of Variation ($C.V._Y$)
$C.V._Y = \frac{2}{105} \times 100 \approx 1.90$
Conclusion
We compare the coefficients of variation for both shares:
- C.V. for Share X $\approx 11.6$
- C.V. for Share Y $\approx 1.90$
Since the Coefficient of Variation for Share Y is smaller than the Coefficient of Variation for Share X ($1.90 < 11.6$), the prices of Share Y are more stable in value.
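For raw (ungrouped) data like these share prices, Python's standard `statistics` module already provides the pieces. The sketch below uses `statistics.pstdev`, the population standard deviation with divisor $n$, to match the formula in this solution. The helper name `cv_percent` is illustrative.

```python
import statistics

def cv_percent(data):
    """Coefficient of variation in percent, using the population SD."""
    return 100 * statistics.pstdev(data) / statistics.fmean(data)

share_x = [35, 54, 52, 53, 56, 58, 52, 50, 51, 49]
share_y = [108, 107, 105, 105, 106, 107, 104, 103, 104, 101]

cv_x = cv_percent(share_x)
cv_y = cv_percent(share_y)
more_stable = "Y" if cv_y < cv_x else "X"
print(round(cv_x, 2), round(cv_y, 2), more_stable)
```

Using `statistics.stdev` instead would divide by $n - 1$ and give slightly larger values, so it would not reproduce the figures above.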
Question 3. An analysis of monthly wages paid to workers in two firms A and B, belonging to the same industry, gives the following results:
(i) Which firm A or B pays larger amount as monthly wages?
(ii) Which firm, A or B, shows greater variability in individual wages?
| | Firm A | Firm B |
|---|---|---|
| No. of wage earners | 586 | 648 |
| Mean of monthly wages | ₹ 5253 | ₹ 5253 |
| Variance of the distribution of wages | 100 | 121 |
Answer:
Given Data:
| | Firm A | Firm B |
| No. of wage earners ($n$) | $n_A = 586$ | $n_B = 648$ |
| Mean of monthly wages ($\bar{x}$) | $\bar{x}_A = \textsf{₹ } 5253$ | $\bar{x}_B = \textsf{₹ } 5253$ |
| Variance of the distribution of wages ($\sigma^2$) | $\sigma_A^2 = 100$ | $\sigma_B^2 = 121$ |
(i) Which firm A or B pays a larger amount as monthly wages?
To find the total amount paid as monthly wages by each firm, we multiply the number of wage earners by the mean monthly wage.
Total monthly wages = Number of wage earners $\times$ Mean monthly wage
For Firm A:
Total Wages$_A = n_A \times \bar{x}_A$
Total Wages$_A = 586 \times 5253 = \textsf{₹ } 30,78,258$
For Firm B:
Total Wages$_B = n_B \times \bar{x}_B$
Total Wages$_B = 648 \times 5253 = \textsf{₹ } 34,03,944$
Comparing the total wages, we see that $\textsf{₹ } 34,03,944 > \textsf{₹ } 30,78,258$.
Therefore, Firm B pays a larger amount as monthly wages.
(ii) Which firm, A or B, shows greater variability in individual wages?
Variability is measured by variance or standard deviation. Since the mean monthly wages for both firms are the same ($\bar{x}_A = \bar{x}_B = \textsf{₹ } 5253$), we can directly compare their variances to determine which has greater variability. The firm with the higher variance has greater variability.
Alternatively, we can calculate the standard deviation ($\sigma$) for each firm.
Standard Deviation = $\sqrt{\text{Variance}}$
For Firm A:
$\sigma_A = \sqrt{100} = 10$
For Firm B:
$\sigma_B = \sqrt{121} = 11$
Comparing the standard deviations, we have $\sigma_B (11) > \sigma_A (10)$.
Since the standard deviation of wages for Firm B is greater than that for Firm A, Firm B shows greater variability in individual wages.
Question 4. The following is the record of goals scored by team A in a football session:
| No. of goals scored | 0 | 1 | 2 | 3 | 4 |
| No. of matches | 1 | 9 | 7 | 5 | 3 |
For the team B, mean number of goals scored per match was 2 with a standard deviation 1.25 goals. Find which team may be considered more consistent?
Answer:
Given:
Data for Team A goals scored:
| No. of goals scored ($x_i$) | No. of matches ($f_i$) |
| 0 | 1 |
| 1 | 9 |
| 2 | 7 |
| 3 | 5 |
| 4 | 3 |
Data for Team B:
Mean number of goals scored ($\overline{x}_B$) = 2
Standard deviation ($\sigma_B$) = 1.25
To Find: Which team is more consistent.
Solution:
Consistency is the opposite of variability: a lower Coefficient of Variation (C.V.) indicates lower relative variability and thus higher consistency. We will calculate the C.V. for both teams and compare them.
The formula for Coefficient of Variation is $C.V. = \frac{\sigma}{\overline{x}} \times 100$.
Calculations for Team A:
We need to calculate the mean ($\overline{x}_A$) and standard deviation ($\sigma_A$) from the frequency distribution.
Total number of matches ($N_A$) = $\sum\limits f_i = 1 + 9 + 7 + 5 + 3 = 25$
Calculate $\sum\limits f_i x_i$:
| $x_i$ | $f_i$ | $f_i x_i$ |
| 0 | 1 | $1 \times 0 = 0$ |
| 1 | 9 | $9 \times 1 = 9$ |
| 2 | 7 | $7 \times 2 = 14$ |
| 3 | 5 | $5 \times 3 = 15$ |
| 4 | 3 | $3 \times 4 = 12$ |
| Total | $N_A = 25$ | $\sum\limits f_i x_i = 50$ |
Mean ($\overline{x}_A$) = $\frac{\sum\limits f_i x_i}{N_A} = \frac{50}{25} = 2$
The mean number of goals scored per match for Team A is 2.
Calculate variance ($\sigma_A^2$). We use the formula $\sigma_A^2 = \frac{\sum\limits f_i x_i^2}{N_A} - (\overline{x}_A)^2$.
Calculate $\sum\limits f_i x_i^2$:
| $x_i$ | $f_i$ | $x_i^2$ | $f_i x_i^2$ |
| 0 | 1 | 0 | $1 \times 0 = 0$ |
| 1 | 9 | 1 | $9 \times 1 = 9$ |
| 2 | 7 | 4 | $7 \times 4 = 28$ |
| 3 | 5 | 9 | $5 \times 9 = 45$ |
| 4 | 3 | 16 | $3 \times 16 = 48$ |
| Total | $N_A = 25$ | | $\sum\limits f_i x_i^2 = 130$ |
Variance ($\sigma_A^2$) = $\frac{130}{25} - (2)^2 = 5.2 - 4 = 1.2$
Standard Deviation ($\sigma_A$) = $\sqrt{1.2} \approx 1.0954$
Coefficient of Variation for Team A ($C.V._A$) = $\frac{\sigma_A}{\overline{x}_A} \times 100 = \frac{1.0954}{2} \times 100 \approx 54.77\%$
Calculations for Team B:
Mean ($\overline{x}_B$) = 2
Standard Deviation ($\sigma_B$) = 1.25
Coefficient of Variation for Team B ($C.V._B$) = $\frac{\sigma_B}{\overline{x}_B} \times 100 = \frac{1.25}{2} \times 100 = 0.625 \times 100 = 62.5\%$
Comparison of Coefficients of Variation:
$C.V._A \approx 54.77\%$
$C.V._B = 62.5\%$
Since $C.V._A < C.V._B$ ($54.77\% < 62.5\%$), Team A has lower relative variability in the number of goals scored per match compared to Team B. Therefore, Team A is more consistent.
Conclusion: Team A may be considered more consistent.
Question 5. The sum and sum of squares corresponding to length x (in cm) and weight y (in gm) of 50 plant products are given below:
$\sum\limits_{i=1}^{50} x_i = 212 \;,\; \sum\limits_{i=1}^{50} x_i^2 = 902.8 \;,\; \sum\limits_{i=1}^{50} y_i = 261 \;,\; \sum\limits_{i=1}^{50} y_i^2 = 1457.6$
Which is more varying, the length or weight?
Answer:
Given:
Number of plant products, $n = 50$.
Sum of lengths: $\sum\limits_{i=1}^{50} x_i = 212$ cm
Sum of squares of lengths: $\sum\limits_{i=1}^{50} x_i^2 = 902.8$ cm$^2$
Sum of weights: $\sum\limits_{i=1}^{50} y_i = 261$ gm
Sum of squares of weights: $\sum\limits_{i=1}^{50} y_i^2 = 1457.6$ gm$^2$
To Find: Which is more varying, the length or weight.
Solution:
To compare the variability of two distributions that are measured in different units (cm and gm), we calculate the Coefficient of Variation (C.V.) for each distribution. The distribution with the higher C.V. is considered more varying.
The Coefficient of Variation is given by the formula:
$C.V. = \frac{\sigma}{\overline{x}} \times 100$
where $\sigma$ is the standard deviation and $\overline{x}$ is the mean.
The standard deviation is the square root of the variance ($\sigma = \sqrt{\sigma^2}$). The variance is calculated as $\sigma^2 = \frac{\sum\limits z_i^2}{n} - (\overline{z})^2$, where $z$ represents the variable (either $x$ or $y$) and $\overline{z} = \frac{\sum\limits z_i}{n}$.
Calculations for Length (x):
Mean of length ($\overline{x}_x$) = $\frac{\sum\limits x_i}{n} = \frac{212}{50} = 4.24$ cm
Variance of length ($\sigma_x^2$) = $\frac{\sum\limits x_i^2}{n} - (\overline{x}_x)^2$
$\sigma_x^2 = \frac{902.8}{50} - (4.24)^2$
$\sigma_x^2 = 18.056 - 17.9776 = 0.0784$ cm$^2$
Standard Deviation of length ($\sigma_x$) = $\sqrt{0.0784} = 0.28$ cm
Coefficient of Variation for length ($C.V._x$) = $\frac{\sigma_x}{\overline{x}_x} \times 100$
$C.V._x = \frac{0.28}{4.24} \times 100 = \frac{28}{424} \times 100 = \frac{7}{106} \times 100 = \frac{700}{106} = \frac{350}{53} \approx 6.60\%$
Calculations for Weight (y):
Mean of weight ($\overline{x}_y$) = $\frac{\sum\limits y_i}{n} = \frac{261}{50} = 5.22$ gm
Variance of weight ($\sigma_y^2$) = $\frac{\sum\limits y_i^2}{n} - (\overline{x}_y)^2$
$\sigma_y^2 = \frac{1457.6}{50} - (5.22)^2$
$\sigma_y^2 = 29.152 - 27.2484 = 1.9036$ gm$^2$
Standard Deviation of weight ($\sigma_y$) = $\sqrt{1.9036} \approx 1.38$ gm
Coefficient of Variation for weight ($C.V._y$) = $\frac{\sigma_y}{\overline{x}_y} \times 100$
$C.V._y \approx \frac{1.38}{5.22} \times 100 = \frac{138}{522} \times 100 = \frac{23}{87} \times 100 = \frac{2300}{87} \approx 26.44\%$
Comparison of Coefficients of Variation:
$C.V._x \approx 6.60\%$
$C.V._y \approx 26.44\%$
Since $C.V._y > C.V._x$ ($26.44\% > 6.60\%$), the weight shows greater relative variability than the length.
Conclusion: The weight is more varying than the length.
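The comparison above can be reproduced with a short Python sketch. The helper name `cv_from_sums` is an assumption made for illustration; it applies the same formulas, $\sigma^2 = \frac{\sum z_i^2}{n} - (\overline{z})^2$ and $C.V. = \frac{\sigma}{\overline{z}} \times 100$, directly to the given sums.

```python
import math

def cv_from_sums(n, total, total_sq):
    """Coefficient of variation (in %) from n, the sum and the sum of squares."""
    mean = total / n
    variance = total_sq / n - mean ** 2   # population variance
    return math.sqrt(variance) / mean * 100

cv_length = cv_from_sums(50, 212, 902.8)    # length x, in cm
cv_weight = cv_from_sums(50, 261, 1457.6)   # weight y, in gm

print(round(cv_length, 2), round(cv_weight, 2))
print("weight varies more" if cv_weight > cv_length else "length varies more")
```

Because the code carries the unrounded standard deviation, the weight figure comes out near 26.43% rather than the 26.44% obtained from $\sigma_y$ rounded to 1.38. The conclusion is unaffected.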
Examples 16 to 19 - Miscellaneous Examples
Example 16: The variance of 20 observations is 5. If each observation is multiplied by 2, find the new variance of the resulting observations.
Answer:
Given:
Number of observations, $n = 20$.
Variance of the original observations ($\sigma^2$) = 5.
Let the original observations be $x_1, x_2, \dots, x_{20}$.
The mean of the original observations is $\overline{x} = \frac{1}{n} \sum\limits_{i=1}^{n} x_i$.
The variance of the original observations is given by:
$\sigma^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (x_i - \overline{x})^2$
We are given $\sigma^2 = 5$.
Each observation is multiplied by 2. Let the new observations be $y_i$.
$y_i = 2x_i$, for $i = 1, 2, \dots, 20$.
The number of new observations is still $n = 20$.
Let the mean of the new observations be $\overline{y}$.
$\overline{y} = \frac{1}{n} \sum\limits_{i=1}^{n} y_i = \frac{1}{20} \sum\limits_{i=1}^{20} (2x_i)$
$\overline{y} = \frac{2}{20} \sum\limits_{i=1}^{20} x_i = 2 \left(\frac{1}{20} \sum\limits_{i=1}^{20} x_i\right)$
$\overline{y} = 2\overline{x}$
The new mean is twice the original mean.
The variance of the new observations ($\sigma_{\text{new}}^2$) is given by:
$\sigma_{\text{new}}^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (y_i - \overline{y})^2$
Substitute $y_i = 2x_i$ and $\overline{y} = 2\overline{x}$ into the formula:
$\sigma_{\text{new}}^2 = \frac{1}{20} \sum\limits_{i=1}^{20} (2x_i - 2\overline{x})^2$
Factor out 2 from the term inside the square:
$\sigma_{\text{new}}^2 = \frac{1}{20} \sum\limits_{i=1}^{20} (2(x_i - \overline{x}))^2$
Square the term $2(x_i - \overline{x})$:
$\sigma_{\text{new}}^2 = \frac{1}{20} \sum\limits_{i=1}^{20} 4(x_i - \overline{x})^2$
Factor out the constant 4 from the summation:
$\sigma_{\text{new}}^2 = 4 \left( \frac{1}{20} \sum\limits_{i=1}^{20} (x_i - \overline{x})^2 \right)$
The expression in the parenthesis is the original variance, $\sigma^2$.
$\sigma_{\text{new}}^2 = 4 \times \sigma^2$
Substitute the given value of the original variance ($\sigma^2 = 5$):
$\sigma_{\text{new}}^2 = 4 \times 5$
$\sigma_{\text{new}}^2 = 20$
The new variance of the resulting observations is 20.
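As a numerical check, the sketch below builds a hypothetical set of 20 observations whose variance is exactly 5 (the example does not list the actual data, so these values are an assumption) and doubles each one. It uses `statistics.pvariance`, which divides by $n$ as the formula above does.

```python
from statistics import pvariance

# Hypothetical data: sixteen values at the mean (10) and four values
# 5 units away from it, so the squared deviations sum to 100 and the
# population variance is 100 / 20 = 5.
original = [10] * 16 + [5, 5, 15, 15]
doubled = [2 * x for x in original]

var_original = pvariance(original)
var_doubled = pvariance(doubled)
print(var_original, var_doubled)
```

The two printed variances are 5 and 20, matching $\sigma_{\text{new}}^2 = 4\sigma^2$.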
Example 17: The mean of 5 observations is 4.4 and their variance is 8.24. If three of the observations are 1, 2 and 6, find the other two observations.
Answer:
Given:
Number of observations, $n = 5$.
Mean of the observations, $\bar{x} = 4.4$.
Variance of the observations, $\sigma^2 = 8.24$.
Three of the five observations are 1, 2, and 6.
To Find:
The other two observations.
Solution:
Let the two unknown observations be $x$ and $y$.
The five observations are 1, 2, 6, $x$, and $y$.
Step 1: Use the Mean to form the first equation
The formula for the mean is $\bar{x} = \frac{\sum\limits x_i}{n}$.
Substitute the given values:
$4.4 = \frac{1 + 2 + 6 + x + y}{5}$
$4.4 \times 5 = 9 + x + y$
$22 = 9 + x + y$
$x + y = 22 - 9$
$x + y = 13$
... (i)
Step 2: Use the Variance to form the second equation
The formula for variance is $\sigma^2 = \frac{\sum\limits x_i^2}{n} - (\bar{x})^2$.
Substitute the given values:
$8.24 = \frac{1^2 + 2^2 + 6^2 + x^2 + y^2}{5} - (4.4)^2$
$8.24 = \frac{1 + 4 + 36 + x^2 + y^2}{5} - 19.36$
$8.24 + 19.36 = \frac{41 + x^2 + y^2}{5}$
$27.6 = \frac{41 + x^2 + y^2}{5}$
$27.6 \times 5 = 41 + x^2 + y^2$
$138 = 41 + x^2 + y^2$
$x^2 + y^2 = 138 - 41$
$x^2 + y^2 = 97$
... (ii)
Step 3: Solve the system of two equations
From equation (i), we have $y = 13 - x$.
Substitute this expression for $y$ into equation (ii):
$x^2 + (13 - x)^2 = 97$
Expand the squared term:
$x^2 + (169 - 26x + x^2) = 97$
Combine like terms and set the equation to zero:
$2x^2 - 26x + 169 - 97 = 0$
$2x^2 - 26x + 72 = 0$
Divide the entire equation by 2 to simplify it:
$x^2 - 13x + 36 = 0$
Factor the quadratic equation:
$(x - 4)(x - 9) = 0$
This gives two possible values for $x$: $x = 4$ or $x = 9$.
Now, find the corresponding values for $y$ using $y = 13 - x$:
- If $x = 4$, then $y = 13 - 4 = 9$.
- If $x = 9$, then $y = 13 - 9 = 4$.
In either case, the two unknown observations are 4 and 9.
Answer:
The other two observations are 4 and 9.
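The three steps above can be sketched as a small Python function. The name `missing_pair` and its signature are assumptions made for illustration. Instead of factoring, it uses the identity $(x - y)^2 = 2(x^2 + y^2) - (x + y)^2$ to recover the two values from their sum and sum of squares.

```python
import math

def missing_pair(n, mean, variance, known):
    """Return the two missing observations, given n, the mean, the
    population variance and the n - 2 known observations."""
    s = n * mean - sum(known)                                   # x + y
    q = n * (variance + mean ** 2) - sum(v * v for v in known)  # x^2 + y^2
    disc = 2 * q - s ** 2                                       # (x - y)^2
    if disc < 0:
        raise ValueError("no real pair matches the given mean and variance")
    root = math.sqrt(disc)
    return (s - root) / 2, (s + root) / 2

x, y = missing_pair(5, 4.4, 8.24, [1, 2, 6])
print(round(x, 6), round(y, 6))
```

The results are rounded before printing because the decimal inputs 4.4 and 8.24 are not represented exactly in floating point.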
Example 18: If each of the observations $x_1, x_2, \dots, x_n$ is increased by ‘a’, where a is a negative or positive number, show that the variance remains unchanged.
Answer:
Given:
Let the original set of observations be $x_1, x_2, \dots, x_n$.
The mean of these observations is $\bar{x} = \frac{1}{n} \sum\limits_{i=1}^{n} x_i$.
The variance of these observations is $\sigma_x^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (x_i - \bar{x})^2$.
A new set of observations, $y_1, y_2, \dots, y_n$, is formed by adding a constant 'a' to each original observation, such that $y_i = x_i + a$ for all $i=1, 2, \dots, n$.
To Prove:
The variance of the new set of observations ($\sigma_y^2$) is equal to the variance of the original set of observations ($\sigma_x^2$). That is, we need to show that $\sigma_y^2 = \sigma_x^2$.
Proof:
Step 1: Find the mean of the new observations ($\bar{y}$)
The mean of the new observations is the sum of all new observations divided by their count, $n$.
$\bar{y} = \frac{1}{n} \sum\limits_{i=1}^{n} y_i$
Substitute $y_i = x_i + a$:
$\bar{y} = \frac{1}{n} \sum\limits_{i=1}^{n} (x_i + a)$
Separate the terms in the summation:
$\bar{y} = \frac{1}{n} \left( \sum\limits_{i=1}^{n} x_i + \sum\limits_{i=1}^{n} a \right)$
The sum of a constant 'a' repeated 'n' times is $na$.
$\bar{y} = \frac{1}{n} \left( \sum\limits_{i=1}^{n} x_i + na \right)$
$\bar{y} = \frac{1}{n} \sum\limits_{i=1}^{n} x_i + \frac{na}{n}$
Since $\frac{1}{n} \sum\limits_{i=1}^{n} x_i = \bar{x}$, we have:
$\bar{y} = \bar{x} + a$
... (i)
This shows that the new mean is the old mean increased by 'a'.
Step 2: Calculate the variance of the new observations ($\sigma_y^2$)
The formula for the variance of the new observations is:
$\sigma_y^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (y_i - \bar{y})^2$
Now, substitute $y_i = x_i + a$ and $\bar{y} = \bar{x} + a$ into this formula:
$\sigma_y^2 = \frac{1}{n} \sum\limits_{i=1}^{n} ((x_i + a) - (\bar{x} + a))^2$
Simplify the term inside the square:
$(x_i + a) - (\bar{x} + a) = x_i + a - \bar{x} - a = x_i - \bar{x}$
The constant 'a' cancels out. Now, substitute this simplified term back into the variance formula:
$\sigma_y^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (x_i - \bar{x})^2$
This expression is the exact formula for the variance of the original observations, $\sigma_x^2$.
Therefore, we can conclude that:
$\sigma_y^2 = \sigma_x^2$
This shows that adding a constant 'a' to each observation does not change the variance. The spread or dispersion of the data points relative to their mean remains the same.
Conclusion:
We have shown that if each observation is increased by a constant 'a', the variance remains unchanged. Hence Proved.
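The result can be illustrated numerically, though a check like this does not replace the proof. In the sketch below the data values and the helper name `shifted_variances` are assumptions chosen for illustration. The variance is computed with `statistics.pvariance`, which divides by $n$.

```python
from statistics import pvariance

data = [3, 7, 8, 12, 15]   # illustrative values, not taken from the example

def shifted_variances(values, shifts):
    """Population variance of the data after adding each constant in shifts."""
    return {a: pvariance([x + a for x in values]) for a in shifts}

result = shifted_variances(data, [-4, 0, 6])
print(pvariance(data), result)
```

Every entry of `result` equals the variance of the unshifted data, for the negative shift as well as the positive one.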
Example 19: The mean and standard deviation of 100 observations were calculated as 40 and 5.1, respectively by a student who took by mistake 50 instead of 40 for one observation. What are the correct mean and standard deviation?
Answer:
Given:
Number of observations, $n = 100$.
Incorrect mean, $\bar{x}_{\text{incorrect}} = 40$.
Incorrect standard deviation, $\sigma_{\text{incorrect}} = 5.1$.
Incorrect observation = 50.
Correct observation = 40.
To Find:
The correct mean ($\bar{x}_{\text{correct}}$) and the correct standard deviation ($\sigma_{\text{correct}}$).
Solution:
Step 1: Calculate the Correct Mean
First, we find the incorrect sum of all observations using the incorrect mean.
Incorrect sum = $n \times \bar{x}_{\text{incorrect}} = 100 \times 40 = 4000$
Next, we find the correct sum by subtracting the incorrect value and adding the correct value.
Correct sum = Incorrect sum - Incorrect value + Correct value
Correct sum = $4000 - 50 + 40 = 3990$
Now, we can calculate the correct mean.
$\bar{x}_{\text{correct}} = \frac{\text{Correct sum}}{n} = \frac{3990}{100} = 39.9$
So, the correct mean is 39.9.
Step 2: Calculate the Correct Standard Deviation
To find the correct standard deviation, we first need to find the correct variance. We start by using the formula for the incorrect variance to find the incorrect sum of squares ($\sum\limits x_i^2$).
The formula for variance is $\sigma^2 = \frac{\sum\limits x_i^2}{n} - (\bar{x})^2$.
From this, the sum of squares is $\sum\limits x_i^2 = n(\sigma^2 + (\bar{x})^2)$.
Using the incorrect values:
Incorrect variance, $\sigma_{\text{incorrect}}^2 = (5.1)^2 = 26.01$.
Incorrect $\sum\limits x_i^2 = 100 (26.01 + (40)^2)$
Incorrect $\sum\limits x_i^2 = 100 (26.01 + 1600)$
Incorrect $\sum\limits x_i^2 = 100 (1626.01) = 162601$
Now, we find the correct sum of squares by subtracting the square of the incorrect value and adding the square of the correct value.
Correct $\sum\limits x_i^2 = \text{Incorrect } \sum\limits x_i^2 - (\text{Incorrect value})^2 + (\text{Correct value})^2$
Correct $\sum\limits x_i^2 = 162601 - (50)^2 + (40)^2$
Correct $\sum\limits x_i^2 = 162601 - 2500 + 1600$
Correct $\sum\limits x_i^2 = 161701$
Now, we can calculate the correct variance using the correct sum of squares and the correct mean.
$\sigma_{\text{correct}}^2 = \frac{\text{Correct } \sum\limits x_i^2}{n} - (\bar{x}_{\text{correct}})^2$
$\sigma_{\text{correct}}^2 = \frac{161701}{100} - (39.9)^2$
$\sigma_{\text{correct}}^2 = 1617.01 - 1592.01 = 25$
Finally, the correct standard deviation is the square root of the correct variance.
$\sigma_{\text{correct}} = \sqrt{25} = 5$
So, the correct standard deviation is 5.
Answer:
The correct mean is 39.9 and the correct standard deviation is 5.
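The correction procedure used above can be sketched in Python. The function name `replace_observation` is an assumption made for illustration. It rebuilds $\sum x_i$ and $\sum x_i^2$ from the reported mean and standard deviation, swaps the wrong value for the right one, and recomputes both statistics.

```python
import math

def replace_observation(n, mean, sd, wrong, right):
    """Mean and population SD after one wrongly recorded value is replaced."""
    total = n * mean - wrong + right
    total_sq = n * (sd ** 2 + mean ** 2) - wrong ** 2 + right ** 2
    new_mean = total / n
    new_variance = total_sq / n - new_mean ** 2
    return new_mean, math.sqrt(new_variance)

correct_mean, correct_sd = replace_observation(100, 40, 5.1, wrong=50, right=40)
print(round(correct_mean, 2), round(correct_sd, 2))
```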
Miscellaneous Exercise On Chapter 15
Question 1. The mean and variance of eight observations are 9 and 9.25, respectively. If six of the observations are 6, 7, 10, 12, 12 and 13, find the remaining two observations.
Answer:
Given:
Number of observations, $n = 8$.
Mean of observations ($\overline{x}$) = 9.
Variance of observations ($\sigma^2$) = 9.25.
Six of the observations are 6, 7, 10, 12, 12, and 13.
To Find: The remaining two observations.
Solution:
Let the two remaining observations be $a$ and $b$. The eight observations are 6, 7, 10, 12, 12, 13, $a$, and $b$.
The mean of the observations is given by the formula:
$\overline{x} = \frac{\sum\limits x_i}{n}$
The sum of the eight observations is:
$\sum\limits x_i = 6 + 7 + 10 + 12 + 12 + 13 + a + b = 60 + a + b$
We are given $\overline{x} = 9$ and $n = 8$. Substitute these values into the mean formula:
$9 = \frac{60 + a + b}{8}$
Multiply both sides by 8:
$9 \times 8 = 60 + a + b$
$72 = 60 + a + b$
Subtract 60 from both sides:
$a + b = 12$
... (i)
The variance of the observations is given by the formula:
$\sigma^2 = \frac{\sum\limits x_i^2}{n} - (\overline{x})^2$
The sum of the squares of the eight observations is:
$\sum\limits x_i^2 = 6^2 + 7^2 + 10^2 + 12^2 + 12^2 + 13^2 + a^2 + b^2$
$\sum\limits x_i^2 = 36 + 49 + 100 + 144 + 144 + 169 + a^2 + b^2$
$\sum\limits x_i^2 = 642 + a^2 + b^2$
We are given $\sigma^2 = 9.25$ and $\overline{x} = 9$. Substitute these values into the variance formula:
$9.25 = \frac{642 + a^2 + b^2}{8} - (9)^2$
$9.25 = \frac{642 + a^2 + b^2}{8} - 81$
Add 81 to both sides:
$9.25 + 81 = \frac{642 + a^2 + b^2}{8}$
$90.25 = \frac{642 + a^2 + b^2}{8}$
Multiply both sides by 8:
$90.25 \times 8 = 642 + a^2 + b^2$
$722 = 642 + a^2 + b^2$
Subtract 642 from both sides:
$a^2 + b^2 = 80$
... (ii)
Now we have a system of two equations with two variables $a$ and $b$:
1) $a + b = 12$
2) $a^2 + b^2 = 80$
From equation (i), we can express $b$ in terms of $a$: $b = 12 - a$.
Substitute this expression for $b$ into equation (ii):
$a^2 + (12 - a)^2 = 80$
Expand $(12 - a)^2$:
$a^2 + (144 - 24a + a^2) = 80$
Combine like terms:
$2a^2 - 24a + 144 = 80$
Subtract 80 from both sides:
$2a^2 - 24a + 144 - 80 = 0$
$2a^2 - 24a + 64 = 0$
Divide the entire equation by 2:
$a^2 - 12a + 32 = 0$
This is a quadratic equation in $a$. We can factor this equation. We look for two numbers that multiply to 32 and add up to -12. These numbers are -4 and -8.
So, we can factor the quadratic equation as:
$(a - 4)(a - 8) = 0$
This gives two possible values for $a$:
$a - 4 = 0 \implies a = 4$
or
$a - 8 = 0 \implies a = 8$
Case 1: If $a = 4$, substitute this into equation (i) to find $b$:
$4 + b = 12 \implies b = 12 - 4 = 8$
In this case, the other two observations are 4 and 8.
Case 2: If $a = 8$, substitute this into equation (i) to find $b$:
$8 + b = 12 \implies b = 12 - 8 = 4$
In this case, the other two observations are 8 and 4.
Both cases result in the same pair of numbers for the remaining observations.
Let's verify if these values satisfy equation (ii): $a^2 + b^2 = 80$.
If $a=4$ and $b=8$, then $4^2 + 8^2 = 16 + 64 = 80$. This is correct.
The remaining two observations are 4 and 8.
Question 2. The mean and variance of 7 observations are 8 and 16, respectively. If five of the observations are 2, 4, 10, 12 and 14, find the remaining two observations.
Answer:
Given:
Number of observations, $n = 7$.
Mean of observations ($\overline{x}$) = 8.
Variance of observations ($\sigma^2$) = 16.
Five of the observations are 2, 4, 10, 12, and 14.
To Find: The remaining two observations.
Solution:
Let the two remaining observations be $a$ and $b$. The seven observations are 2, 4, 10, 12, 14, $a$, and $b$.
The mean of the observations is given by the formula:
$\overline{x} = \frac{\sum\limits x_i}{n}$
The sum of the seven observations is:
$\sum\limits x_i = 2 + 4 + 10 + 12 + 14 + a + b = 42 + a + b$
We are given $\overline{x} = 8$ and $n = 7$. Substitute these values into the mean formula:
$8 = \frac{42 + a + b}{7}$
Multiply both sides by 7:
$8 \times 7 = 42 + a + b$
$56 = 42 + a + b$
Subtract 42 from both sides:
$a + b = 14$
... (i)
The variance of the observations is given by the formula:
$\sigma^2 = \frac{\sum\limits x_i^2}{n} - (\overline{x})^2$
The sum of the squares of the seven observations is:
$\sum\limits x_i^2 = 2^2 + 4^2 + 10^2 + 12^2 + 14^2 + a^2 + b^2$
$\sum\limits x_i^2 = 4 + 16 + 100 + 144 + 196 + a^2 + b^2$
$\sum\limits x_i^2 = 460 + a^2 + b^2$
We are given $\sigma^2 = 16$ and $\overline{x} = 8$. Substitute these values into the variance formula:
$16 = \frac{460 + a^2 + b^2}{7} - (8)^2$
$16 = \frac{460 + a^2 + b^2}{7} - 64$
Add 64 to both sides:
$16 + 64 = \frac{460 + a^2 + b^2}{7}$
$80 = \frac{460 + a^2 + b^2}{7}$
Multiply both sides by 7:
$80 \times 7 = 460 + a^2 + b^2$
$560 = 460 + a^2 + b^2$
Subtract 460 from both sides:
$a^2 + b^2 = 100$
... (ii)
Now we have a system of two equations with two variables $a$ and $b$:
1) $a + b = 14$
2) $a^2 + b^2 = 100$
From equation (i), we can express $b$ in terms of $a$: $b = 14 - a$.
Substitute this expression for $b$ into equation (ii):
$a^2 + (14 - a)^2 = 100$
Expand $(14 - a)^2$ using the formula $(p - q)^2 = p^2 - 2pq + q^2$:
$a^2 + (14^2 - 2 \times 14 \times a + a^2) = 100$
$a^2 + 196 - 28a + a^2 = 100$
Combine like terms:
$2a^2 - 28a + 196 = 100$
Subtract 100 from both sides:
$2a^2 - 28a + 196 - 100 = 0$
$2a^2 - 28a + 96 = 0$
Divide the entire equation by 2:
$a^2 - 14a + 48 = 0$
This is a quadratic equation in $a$. We can solve it by factoring. We look for two numbers that multiply to 48 and add up to -14. These numbers are -6 and -8.
So, we can factor the quadratic equation as:
$(a - 6)(a - 8) = 0$
This gives two possible values for $a$:
$a - 6 = 0 \implies a = 6$
or
$a - 8 = 0 \implies a = 8$
Case 1: If $a = 6$, substitute this into equation (i) to find $b$:
$6 + b = 14 \implies b = 14 - 6 = 8$
In this case, the other two observations are 6 and 8.
Case 2: If $a = 8$, substitute this into equation (i) to find $b$:
$8 + b = 14 \implies b = 14 - 8 = 6$
In this case, the other two observations are 8 and 6.
Both cases result in the same pair of numbers for the remaining observations.
Let's verify if these values satisfy equation (ii): $a^2 + b^2 = 100$.
If $a=6$ and $b=8$, then $6^2 + 8^2 = 36 + 64 = 100$. This is correct.
The remaining two observations are 6 and 8.
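The answers to Questions 1 and 2 can be checked by recomputing the mean and variance of each completed data set. The sketch below uses Python's standard `statistics` module; the helper name `summary` is an assumption made for illustration.

```python
from statistics import fmean, pvariance

def summary(observations):
    """Return (mean, population variance) of a list of observations."""
    return fmean(observations), pvariance(observations)

q1 = summary([6, 7, 10, 12, 12, 13, 4, 8])   # given: mean 9, variance 9.25
q2 = summary([2, 4, 10, 12, 14, 6, 8])       # given: mean 8, variance 16
print(q1, q2)
```

`pvariance` divides by $n$, matching the variance formula used in these solutions. `statistics.variance` divides by $n - 1$ and would give different values.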
Question 3. The mean and standard deviation of six observations are 8 and 4, respectively. If each observation is multiplied by 3, find the new mean and new standard deviation of the resulting observations.
Answer:
Given:
Number of observations, $n = 6$.
Mean of original observations ($\overline{x}_{\text{original}}$) = 8.
Standard deviation of original observations ($\sigma_{\text{original}}$) = 4.
Each observation is multiplied by 3.
To Find:
The new mean and new standard deviation.
Solution:
Let the original observations be $x_1, x_2, \dots, x_6$.
The mean of the original observations is $\overline{x}_{\text{original}} = \frac{1}{6} \sum\limits_{i=1}^{6} x_i = 8$.
The standard deviation of the original observations is $\sigma_{\text{original}} = \sqrt{\frac{\sum\limits_{i=1}^{6} (x_i - \overline{x}_{\text{original}})^2}{6}} = 4$.
The new observations $y_i$ are obtained by multiplying each $x_i$ by a constant $k=3$. So, $y_i = 3x_i$ for $i = 1, 2, \dots, 6$.
New Mean:
The new mean ($\overline{y}_{\text{new}}$) is given by:
$\overline{y}_{\text{new}} = \frac{1}{n} \sum\limits_{i=1}^{n} y_i$
Substitute $y_i = 3x_i$ and $n=6$:
$\overline{y}_{\text{new}} = \frac{1}{6} \sum\limits_{i=1}^{6} (3x_i)$
Using the property of summation $\sum\limits k z_i = k \sum\limits z_i$:
$\overline{y}_{\text{new}} = 3 \left( \frac{1}{6} \sum\limits_{i=1}^{6} x_i \right)$
The term in the parenthesis is the original mean $\overline{x}_{\text{original}}$.
$\overline{y}_{\text{new}} = 3 \times \overline{x}_{\text{original}}$
Substitute the given value of $\overline{x}_{\text{original}} = 8$:
$\overline{y}_{\text{new}} = 3 \times 8 = 24$
The new mean is 24.
New Standard Deviation:
The variance of the original observations is $\sigma_{\text{original}}^2 = (\sigma_{\text{original}})^2 = 4^2 = 16$.
The variance of the new observations ($\sigma_{\text{new}}^2$) is given by:
$\sigma_{\text{new}}^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (y_i - \overline{y}_{\text{new}})^2$
Substitute $y_i = 3x_i$ and $\overline{y}_{\text{new}} = 3\overline{x}_{\text{original}}$:
$\sigma_{\text{new}}^2 = \frac{1}{6} \sum\limits_{i=1}^{6} (3x_i - 3\overline{x}_{\text{original}})^2$
Factor out 3 from the term inside the square:
$\sigma_{\text{new}}^2 = \frac{1}{6} \sum\limits_{i=1}^{6} (3(x_i - \overline{x}_{\text{original}}))^2$
Square the term $3(x_i - \overline{x}_{\text{original}})$:
$\sigma_{\text{new}}^2 = \frac{1}{6} \sum\limits_{i=1}^{6} 9(x_i - \overline{x}_{\text{original}})^2$
Factor out the constant 9 from the summation:
$\sigma_{\text{new}}^2 = 9 \left( \frac{1}{6} \sum\limits_{i=1}^{6} (x_i - \overline{x}_{\text{original}})^2 \right)$
The expression in the parenthesis is the original variance $\sigma_{\text{original}}^2$.
$\sigma_{\text{new}}^2 = 9 \times \sigma_{\text{original}}^2$
Substitute the value of $\sigma_{\text{original}}^2 = 16$:
$\sigma_{\text{new}}^2 = 9 \times 16 = 144$
The new variance is 144.
The new standard deviation ($\sigma_{\text{new}}$) is the square root of the new variance:
$\sigma_{\text{new}} = \sqrt{\sigma_{\text{new}}^2} = \sqrt{144} = 12$
Alternate Method using Property:
If each observation $x_i$ is multiplied by a constant $k$, the new mean is $\overline{y} = k\overline{x}$ and the new standard deviation is $\sigma_y = |k|\sigma_x$.
Here, $k=3$.
New mean = $3 \times \text{Original Mean} = 3 \times 8 = 24$.
New standard deviation = $|3| \times \text{Original Standard Deviation} = 3 \times 4 = 12$.
Both methods yield the same result.
The new mean of the resulting observations is 24 and the new standard deviation is 12.
Question 4. Given that $\overline{x}$ is the mean and $\sigma^2$ is the variance of $n$ observations $x_1, x_2, \dots, x_n$, prove that the mean and variance of the observations $ax_1, ax_2, ax_3, \dots, ax_n$ are $a\overline{x}$ and $a^2\sigma^2$, respectively ($a \neq 0$).
Answer:
Given:
A set of $n$ observations: $x_1, x_2, \dots, x_n$.
Mean of these observations = $\overline{x}$.
Variance of these observations = $\sigma^2$.
A new set of observations is created by multiplying each original observation by a non-zero constant 'a', resulting in $y_1 = ax_1, y_2 = ax_2, \dots, y_n = ax_n$.
To Prove:
The mean of the new observations is $a\overline{x}$.
The variance of the new observations is $a^2 \sigma^2$.
Proof for the Mean:
The mean of the original observations is defined as:
$\overline{x} = \frac{1}{n} \sum\limits_{i=1}^{n} x_i$
Let the mean of the new observations be $\overline{y}$. By definition:
$\overline{y} = \frac{1}{n} \sum\limits_{i=1}^{n} y_i$
Substitute $y_i = ax_i$:
$\overline{y} = \frac{1}{n} \sum\limits_{i=1}^{n} (ax_i)$
Using the property of summation $\sum\limits k z_i = k \sum\limits z_i$:
$\overline{y} = a \left( \frac{1}{n} \sum\limits_{i=1}^{n} x_i \right)$
The expression in the parenthesis is the original mean $\overline{x}$.
$\overline{y} = a\overline{x}$
Thus, the mean of the observations $ax_1, ax_2, \dots, ax_n$ is $a\overline{x}$.
Proof for the Variance:
The variance of the original observations is defined as:
$\sigma^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (x_i - \overline{x})^2$
Let the variance of the new observations be $\sigma_y^2$. By definition:
$\sigma_y^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (y_i - \overline{y})^2$
Substitute $y_i = ax_i$ and the new mean $\overline{y} = a\overline{x}$ (proved above):
$\sigma_y^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (ax_i - a\overline{x})^2$
Factor out 'a' from the term inside the square:
$\sigma_y^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (a(x_i - \overline{x}))^2$
Square the term $a(x_i - \overline{x})$:
$\sigma_y^2 = \frac{1}{n} \sum\limits_{i=1}^{n} a^2 (x_i - \overline{x})^2$
Since $a^2$ is a constant (and $a \neq 0$), we can factor it out from the summation:
$\sigma_y^2 = a^2 \left( \frac{1}{n} \sum\limits_{i=1}^{n} (x_i - \overline{x})^2 \right)$
The expression in the parenthesis is the original variance $\sigma^2$.
$\sigma_y^2 = a^2 \sigma^2$
Thus, the variance of the observations $ax_1, ax_2, \dots, ax_n$ is $a^2 \sigma^2$.
We have shown that the mean and variance of the observations $ax_1, ax_2, \dots, ax_n$ are $a\overline{x}$ and $a^2 \sigma^2$, respectively, given that $a \neq 0$.
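A quick exact-arithmetic check of both identities is sketched below, including a negative and a fractional multiplier. The data values and the helper names `mean` and `variance` are assumptions for illustration. `fractions.Fraction` keeps every step exact, so the equalities can be tested with `==`.

```python
from fractions import Fraction

def mean(values):
    return sum(values) / Fraction(len(values))

def variance(values):
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / len(values)

data = [2, 5, 9, 10]   # illustrative values, not taken from the question
checks = []
for a in (Fraction(-3), Fraction(1, 2), Fraction(4)):
    scaled = [a * x for x in data]
    checks.append(mean(scaled) == a * mean(data))
    checks.append(variance(scaled) == a ** 2 * variance(data))

print(all(checks))
```

Since the variance scales by $a^2$, a negative multiplier still gives a non-negative variance, and the standard deviation scales by $|a|$.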
Question 5. The mean and standard deviation of 20 observations are found to be 10 and 2, respectively. On rechecking, it was found that an observation 8 was incorrect. Calculate the correct mean and standard deviation in each of the following cases:
(i) If wrong item is omitted.
(ii) If it is replaced by 12.
Answer:
Given:
Number of observations in the original (incorrect) data ($n_{\text{incorrect}}$) = 20.
Incorrect mean ($\overline{x}_{\text{incorrect}}$) = 10.
Incorrect standard deviation ($\sigma_{\text{incorrect}}$) = 2.
Incorrect observation recorded = 8.
To Find: The correct mean and standard deviation for two cases.
Solution:
From the incorrect mean, we can find the incorrect sum of observations:
$\overline{x}_{\text{incorrect}} = \frac{\sum\limits x_{\text{incorrect}}}{n_{\text{incorrect}}}$
... (A)
$\sum\limits x_{\text{incorrect}} = \overline{x}_{\text{incorrect}} \times n_{\text{incorrect}}$
$\sum\limits x_{\text{incorrect}} = 10 \times 20 = 200$
The incorrect sum of observations is 200.
From the incorrect standard deviation, we can find the incorrect variance:
$\sigma_{\text{incorrect}}^2 = (\sigma_{\text{incorrect}})^2 = 2^2 = 4$
The formula for variance is $\sigma^2 = \frac{\sum\limits x_i^2}{n} - (\overline{x})^2$.
Using this, we find the incorrect sum of squares:
$\sigma_{\text{incorrect}}^2 = \frac{\sum\limits x^2_{\text{incorrect}}}{n_{\text{incorrect}}} - (\overline{x}_{\text{incorrect}})^2$
... (B)
$4 = \frac{\sum\limits x^2_{\text{incorrect}}}{20} - (10)^2$
$4 = \frac{\sum\limits x^2_{\text{incorrect}}}{20} - 100$
$\frac{\sum\limits x^2_{\text{incorrect}}}{20} = 100 + 4 = 104$
$\sum\limits x^2_{\text{incorrect}} = 104 \times 20 = 2080$
The incorrect sum of squares is 2080.
Case (i) If wrong item is omitted:
The incorrect observation (8) is removed from the data.
New number of observations ($n_{\text{new}}$) = $n_{\text{incorrect}} - 1 = 20 - 1 = 19$
Correct sum of observations ($\sum\limits x_{\text{correct}}$) = Incorrect sum of observations - Incorrect observation
$\sum\limits x_{\text{correct}} = 200 - 8 = 192$
Calculate the correct mean:
$\overline{x}_{\text{correct}} = \frac{\sum\limits x_{\text{correct}}}{n_{\text{new}}}$
$\overline{x}_{\text{correct}} = \frac{192}{19}$
The correct mean is $\frac{192}{19}$.
Correct sum of squares ($\sum\limits x^2_{\text{correct}}$) = Incorrect sum of squares - (Incorrect observation)$^2$
$\sum\limits x^2_{\text{correct}} = 2080 - 8^2 = 2080 - 64 = 2016$
Calculate the correct variance:
$\sigma_{\text{correct}}^2 = \frac{\sum\limits x^2_{\text{correct}}}{n_{\text{new}}} - (\overline{x}_{\text{correct}})^2$
$\sigma_{\text{correct}}^2 = \frac{2016}{19} - \left(\frac{192}{19}\right)^2$
$\sigma_{\text{correct}}^2 = \frac{2016}{19} - \frac{36864}{361}$
To combine the fractions, find a common denominator (361):
$\sigma_{\text{correct}}^2 = \frac{2016 \times 19}{361} - \frac{36864}{361} = \frac{38304}{361} - \frac{36864}{361} = \frac{1440}{361}$
Calculate the correct standard deviation:
$\sigma_{\text{correct}} = \sqrt{\sigma_{\text{correct}}^2} = \sqrt{\frac{1440}{361}} = \frac{\sqrt{144 \times 10}}{\sqrt{361}} = \frac{12\sqrt{10}}{19}$
Using $\sqrt{10} \approx 3.162$: $\sigma_{\text{correct}} \approx \frac{12 \times 3.162}{19} \approx \frac{37.944}{19} \approx 1.997$
The correct mean is $\frac{192}{19} \approx 10.11$ and the correct standard deviation is $\frac{12\sqrt{10}}{19} \approx 1.997$ if the wrong item is omitted.
Case (ii) If it is replaced by 12:
The incorrect observation (8) is replaced by the correct observation (12).
New number of observations ($n_{\text{new}}$) = $n_{\text{incorrect}} = 20$ (since one item is replaced, the number of observations remains the same).
Correct sum of observations ($\sum\limits x_{\text{correct}}$) = Incorrect sum of observations - Incorrect observation + Correct observation
$\sum\limits x_{\text{correct}} = 200 - 8 + 12 = 204$
Calculate the correct mean:
$\overline{x}_{\text{correct}} = \frac{\sum\limits x_{\text{correct}}}{n_{\text{new}}}$
$\overline{x}_{\text{correct}} = \frac{204}{20} = 10.2$
The correct mean is 10.2.
Correct sum of squares ($\sum\limits x^2_{\text{correct}}$) = Incorrect sum of squares - (Incorrect observation)$^2$ + (Correct observation)$^2$
$\sum\limits x^2_{\text{correct}} = 2080 - 8^2 + 12^2 = 2080 - 64 + 144 = 2080 + 80 = 2160$
Calculate the correct variance:
$\sigma_{\text{correct}}^2 = \frac{\sum\limits x^2_{\text{correct}}}{n_{\text{new}}} - (\overline{x}_{\text{correct}})^2$
$\sigma_{\text{correct}}^2 = \frac{2160}{20} - (10.2)^2$
$\sigma_{\text{correct}}^2 = 108 - 104.04$
$\sigma_{\text{correct}}^2 = 3.96$
Calculate the correct standard deviation:
$\sigma_{\text{correct}} = \sqrt{\sigma_{\text{correct}}^2} = \sqrt{3.96}$
Using a calculator, $\sqrt{3.96} \approx 1.98997$
The correct standard deviation is approximately 1.99.
The correct mean is 10.2 and the correct standard deviation is $\sqrt{3.96} \approx 1.99$ if the wrong item is replaced by 12.
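Both cases follow the same pattern: adjust $\sum x$, $\sum x^2$ and $n$, then recompute. A minimal Python sketch is given below. The function name `corrected_stats` and its `remove`/`add` parameters are assumptions made for illustration. Exact fractions are used so that the results can be compared with $\frac{1440}{361}$ and $3.96$ directly.

```python
from fractions import Fraction
import math

def corrected_stats(n, mean, sd, remove=(), add=()):
    """Return (mean, population variance) after removing and adding values.

    Pass integers or Fractions for exact results."""
    mean, sd = Fraction(mean), Fraction(sd)
    total = n * mean - sum(remove) + sum(add)
    total_sq = (n * (sd ** 2 + mean ** 2)
                - sum(x * x for x in remove) + sum(x * x for x in add))
    m = n - len(remove) + len(add)
    new_mean = total / m
    return new_mean, total_sq / m - new_mean ** 2

mean_i, var_i = corrected_stats(20, 10, 2, remove=[8])               # case (i)
mean_ii, var_ii = corrected_stats(20, 10, 2, remove=[8], add=[12])   # case (ii)

print(mean_i, var_i, round(math.sqrt(var_i), 3))
print(mean_ii, var_ii, round(math.sqrt(var_ii), 3))
```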
Question 6. The mean and standard deviation of marks obtained by 50 students of a class in three subjects, Mathematics, Physics and Chemistry are given below:
| Subject | Mathematics | Physics | Chemistry |
|---|---|---|---|
| Mean | 42 | 32 | 40.9 |
| Standard deviation | 12 | 15 | 20 |
Which of the three subjects shows the highest variability in marks and which shows the lowest?
Answer:
Given:
The mean and standard deviation for the marks of 50 students in three subjects are given in the table:
| Subject | Mean ($\overline{x}$) | Standard Deviation ($\sigma$) |
|---|---|---|
| Mathematics | 42 | 12 |
| Physics | 32 | 15 |
| Chemistry | 40.9 | 20 |
To Find:
We need to find which subject shows the highest variability in marks and which shows the lowest.
Solution:
To compare the variability of two or more data sets with different means, we use the Coefficient of Variation (CV).
The formula for Coefficient of Variation is:
$CV = \frac{\text{Standard Deviation}}{\text{Mean}} \times 100\%$
Let's calculate the Coefficient of Variation for each subject:
Coefficient of Variation for Mathematics:
$CV_{Math} = \frac{\sigma_{Math}}{\overline{x}_{Math}} \times 100\%$
$CV_{Math} = \frac{12}{42} \times 100\%$
$CV_{Math} = \frac{2}{7} \times 100\%$
$CV_{Math} \approx 0.2857 \times 100\%$
$CV_{Math} \approx 28.57\%$
Coefficient of Variation for Physics:
$CV_{Physics} = \frac{\sigma_{Physics}}{\overline{x}_{Physics}} \times 100\%$
$CV_{Physics} = \frac{15}{32} \times 100\%$
$CV_{Physics} = 0.46875 \times 100\%$
$CV_{Physics} = 46.875\%$
$CV_{Physics} \approx 46.88\%$
Coefficient of Variation for Chemistry:
$CV_{Chemistry} = \frac{\sigma_{Chemistry}}{\overline{x}_{Chemistry}} \times 100\%$
$CV_{Chemistry} = \frac{20}{40.9} \times 100\%$
$CV_{Chemistry} = \frac{2000}{40.9}\%$
$CV_{Chemistry} \approx 48.90\%$
Comparing the Coefficients of Variation:
$CV_{Math} \approx 28.57\%$
$CV_{Physics} \approx 46.88\%$
$CV_{Chemistry} \approx 48.90\%$
A higher Coefficient of Variation indicates greater variability.
The highest Coefficient of Variation is for Chemistry ($48.90\%$).
The lowest Coefficient of Variation is for Mathematics ($28.57\%$).
Therefore, Chemistry shows the highest variability in marks, and Mathematics shows the lowest variability.
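The same comparison can be written as a short Python sketch. The dictionary layout and variable names are assumptions made for illustration; the numbers come from the table in the question.

```python
# (mean, standard deviation) of marks for each subject
stats = {
    "Mathematics": (42, 12),
    "Physics": (32, 15),
    "Chemistry": (40.9, 20),
}

# Coefficient of variation in percent: CV = sd / mean * 100
cv = {subject: sd / mean * 100 for subject, (mean, sd) in stats.items()}

most_variable = max(cv, key=cv.get)
least_variable = min(cv, key=cv.get)

for subject, value in cv.items():
    print(f"{subject}: {value:.2f}%")
print("highest:", most_variable, "| lowest:", least_variable)
```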
Question 7. The mean and standard deviation of a group of 100 observations were found to be 20 and 3, respectively. Later on it was found that three observations were incorrect, which were recorded as 21, 21 and 18. Find the mean and standard deviation if the incorrect observations are omitted.
Answer:
Given:
Number of observations, $n_{old} = 100$
Old Mean, $\overline{x}_{old} = 20$
Old Standard Deviation, $\sigma_{old} = 3$
Incorrect observations are 21, 21, and 18.
To Find:
The new mean and standard deviation after omitting the incorrect observations.
Solution:
We know that the mean is given by $\overline{x} = \frac{\sum\limits x_i}{n}$.
The sum of the old observations is $\sum\limits x_{old} = n_{old} \times \overline{x}_{old}$.
$\sum\limits x_{old} = 100 \times 20 = 2000$
The incorrect observations are 21, 21, and 18.
The sum of incorrect observations $= 21 + 21 + 18 = 60$.
The correct sum of the remaining observations is $\sum\limits x_{new} = \sum\limits x_{old} - \text{Sum of incorrect observations}$.
$\sum\limits x_{new} = 2000 - 60 = 1940$
The new number of observations is $n_{new} = n_{old} - \text{Number of incorrect observations}$.
$n_{new} = 100 - 3 = 97$
The new mean is $\overline{x}_{new} = \frac{\sum\limits x_{new}}{n_{new}}$.
$\overline{x}_{new} = \frac{1940}{97} = 20$
... (i)
Now, we need to find the new standard deviation. The formula for standard deviation is $\sigma = \sqrt{\frac{\sum\limits x_i^2}{n} - \overline{x}^2}$.
Squaring the standard deviation, we get the variance: $\sigma^2 = \frac{\sum\limits x_i^2}{n} - \overline{x}^2$.
From the old data, we have $\sigma_{old}^2 = \frac{\sum\limits x_{old}^2}{n_{old}} - \overline{x}_{old}^2$.
$3^2 = \frac{\sum\limits x_{old}^2}{100} - 20^2$
$9 = \frac{\sum\limits x_{old}^2}{100} - 400$
$9 + 400 = \frac{\sum\limits x_{old}^2}{100}$
$409 = \frac{\sum\limits x_{old}^2}{100}$
$\sum\limits x_{old}^2 = 409 \times 100 = 40900$
The sum of squares of incorrect observations is $21^2 + 21^2 + 18^2 = 441 + 441 + 324 = 1206$.
The correct sum of squares of the remaining observations is $\sum\limits x_{new}^2 = \sum\limits x_{old}^2 - \text{Sum of squares of incorrect observations}$.
$\sum\limits x_{new}^2 = 40900 - 1206 = 39694$
Now we can calculate the new variance, $\sigma_{new}^2$, using the new sum of squares and the new mean.
$\sigma_{new}^2 = \frac{\sum\limits x_{new}^2}{n_{new}} - \overline{x}_{new}^2$
$\sigma_{new}^2 = \frac{39694}{97} - 20^2$
$\sigma_{new}^2 = \frac{39694}{97} - 400$
$\sigma_{new}^2 = \frac{39694 - 400 \times 97}{97}$
$400 \times 97 = 38800$
$\sigma_{new}^2 = \frac{39694 - 38800}{97}$
$\sigma_{new}^2 = \frac{894}{97}$
... (ii)
Finally, the new standard deviation is $\sigma_{new} = \sqrt{\sigma_{new}^2}$.
$\sigma_{new} = \sqrt{\frac{894}{97}}$
$\sigma_{new} \approx \sqrt{9.216494845}$
$\sigma_{new} \approx 3.035866$
Rounding to two decimal places, $\sigma_{new} \approx 3.04$.
... (iii)
Thus, the new mean is 20 and the new standard deviation is approximately 3.04.
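Because the mean stays at 20 after the omission, the result can be cross-checked by a shorter route that works with squared deviations from the mean instead of $\sum x_i^2$. The sketch below is an alternative check, not the method used above, and its variable names are assumptions. The shortcut is valid only when the old and new means coincide, so the code asserts that first.

```python
import math

n_old, mean_old, sd_old = 100, 20, 3
omitted = [21, 21, 18]

n_new = n_old - len(omitted)
mean_new = (n_old * mean_old - sum(omitted)) / n_new

# The deviation shortcut below requires an unchanged mean.
assert mean_new == mean_old

# Sum of squared deviations from 20 over all 100 values: n * sigma^2 = 900
ss_old = n_old * sd_old ** 2
# Remove the omitted values' contributions: 1 + 1 + 4 = 6
ss_new = ss_old - sum((x - mean_old) ** 2 for x in omitted)

sd_new = math.sqrt(ss_new / n_new)
print(mean_new, ss_new, round(sd_new, 2))
```

This gives $\sigma_{new}^2 = \frac{900 - 6}{97} = \frac{894}{97}$, the same value as above.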